Data Persistence – ZODB

ZODB (Zope Object Database) is a database for storing Python objects. It is ACID compliant, a feature that many NoSQL databases do not offer. Like many NoSQL databases, ZODB is open source, horizontally scalable and schema-free. However, it is not distributed and does not offer easy replication. It provides a persistence mechanism for Python objects. It is part of the Zope application server, but it can also be used independently.

ZODB was created by Jim Fulton of Zope Corporation. It started as a simple Persistent Object System. Its current version at the time of writing is 5.5.0. It is written in Python and uses an extended version of Python's built-in object persistence (pickle).

Some of the main features of ZODB are:

- transactions
- history/undo
- transparently pluggable storage
- built-in caching
- multiversion concurrency control (MVCC)
- scalability across a network

ZODB is a hierarchical database. There is a root object, initialized when a database is created. The root object is used like a Python dictionary and it can contain other objects (which can be dictionary-like themselves). To store an object in the database, it is enough to assign it to a new key inside its container.

ZODB is useful for applications where data is hierarchical and there are likely to be more reads than writes. Because ZODB builds on pickle, its data can be processed only from Python code.

To install the latest version of ZODB, use the pip utility:

    pip install zodb

The following dependencies are also installed:

    BTrees==4.6.1
    cffi==1.13.2
    persistent==4.5.1
    pycparser==2.19
    six==1.13.0
    transaction==2.4.0

ZODB provides the following storage options:

FileStorage
This is the default. Everything is stored in one big Data.fs file, which is essentially a transaction log.

DirectoryStorage
This stores one file per object revision. In this case, the Data.fs.index does not have to be rebuilt after an unclean shutdown.

RelStorage
This stores pickles in a relational database.
PostgreSQL, MySQL and Oracle are supported.

To create a ZODB database we need a storage, a database and finally a connection. The first step is to create a storage object.

    import ZODB, ZODB.FileStorage
    storage = ZODB.FileStorage.FileStorage('mydata.fs')

The DB class uses this storage object to obtain a database object.

    db = ZODB.DB(storage)

Pass None to the DB constructor to create an in-memory database.

    db = ZODB.DB(None)

Finally, we establish a connection with the database.

    conn = db.open()

The connection object then gives access to the root of the database through the root() method. The root object is the dictionary that holds all of your persistent objects.

    root = conn.root()

For example, we add a list of students to the root object as follows:

    root['students'] = ['Mary', 'Maya', 'Meet']

This change is not permanently saved in the database until we commit the transaction.

    import transaction
    transaction.commit()

To store an object of a user-defined class, the class must inherit from the persistent.Persistent parent class.

Advantages of Subclassing

Subclassing the Persistent class has the following advantages:

- The database will automatically track object changes made by setting attributes.
- Data will be saved in its own database record.

You can save data that doesn't subclass Persistent, but it will be stored in the database record of whatever persistent object references it. Non-persistent objects are owned by their containing persistent object, and if multiple persistent objects refer to the same non-persistent subobject, they will each get their own copy.

Let us define a student class that subclasses the Persistent class:

    import persistent

    class student(persistent.Persistent):
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            return str(self.name)

To add an object of this class, let us first set up the connection as described above.
    import ZODB, ZODB.FileStorage
    storage = ZODB.FileStorage.FileStorage('studentdata.fs')
    db = ZODB.DB(storage)
    conn = db.open()
    root = conn.root()

Declare an object, add it to the root and then commit the transaction.

    s1 = student("Akash")
    root['s1'] = s1
    import transaction
    transaction.commit()

The list of all objects added to the root can be retrieved as a view object with the help of the items() method, since the root object is similar to a built-in dictionary.

    print(root.items())
    ItemsView({'s1': Akash})

To fetch an attribute of a specific object from the root:

    print(root['s1'].name)
    Akash

The object can be easily updated. Since the ZODB API is a pure Python package, it doesn't require any external SQL-type language.

    root['s1'].name = 'Abhishek'
    import transaction
    transaction.commit()

The change is written to the database when the transaction is committed. Note that the transaction module also provides an abort() function, which is similar to the ROLLBACK transaction control in SQL. Close the connection once you have finished working with the root object.

    conn.close()

Python Data Persistence – Plistlib Module

The plist format is mainly used by macOS (formerly Mac OS X). These files are typically XML documents that store and retrieve the properties of an object. Python's standard library contains the plistlib module, which is used to read and write 'property list' files (they usually have a .plist extension).

The plistlib module is similar to other serialization libraries in that it provides dumps() and loads() functions for converting Python objects to and from a bytes representation, and dump() and load() functions for disk operations.

The following dictionary object maintains properties (keys) and their corresponding values:

    proplist = {
       "name" : "Ganesh",
       "designation" : "manager",
       "dept" : "accts",
       "salary" : {"basic":12000, "da":4000, "hra":800}
    }

In order to write these properties to a disk file, we call the dump() function of the plistlib module.

    import plistlib
    fileName = open('salary.plist', 'wb')
    plistlib.dump(proplist, fileName)
    fileName.close()

Conversely, to read back the property values, use the load() function as follows:

    fp = open('salary.plist', 'rb')
    pl = plistlib.load(fp)
    print(pl)
    fp.close()
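The in-memory counterparts dumps() and loads() can be tried without touching the disk. The following is a minimal sketch that reuses the proplist dictionary from above; the variable names are illustrative only. It also shows the binary plist format selected with plistlib.FMT_BINARY.

```python
import plistlib

proplist = {
    "name": "Ganesh",
    "designation": "manager",
    "dept": "accts",
    "salary": {"basic": 12000, "da": 4000, "hra": 800},
}

# dumps() returns a bytes object (XML format by default), not a str
xml_bytes = plistlib.dumps(proplist)
restored = plistlib.loads(xml_bytes)

# The compact binary plist format is selected with the fmt argument;
# loads() detects the format automatically
binary_bytes = plistlib.dumps(proplist, fmt=plistlib.FMT_BINARY)
restored_binary = plistlib.loads(binary_bytes)

print(restored == proplist, restored_binary == proplist)
```

Both round trips should give back a dictionary equal to the original one.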

Python Data Persistence – Marshal Module

The object serialization features of the marshal module in Python's standard library are similar to those of the pickle module. However, this module is not meant for general-purpose data. Instead, it is used by Python itself for internal object serialization, to support read/write operations on compiled versions of Python modules (.pyc files).

The data format used by the marshal module is not compatible across Python versions. Therefore, a compiled Python script (.pyc file) of one version most probably won't execute on another.

Just like the pickle module, the marshal module defines load() and dump() functions for reading and writing marshalled objects from / to a file.

dump()
This function writes the byte representation of a supported Python object to a file. The file itself must be a binary file opened with write permission.

load()
This function reads the byte data from a binary file and converts it to a Python object.

The following example demonstrates the use of the dump() and load() functions to handle Python code objects, which are used to store precompiled Python modules.

The code uses the built-in compile() function to build a code object out of a source string that embeds Python instructions.

    compile(source, filename, mode)

The filename parameter should be the file from which the code was read. If it wasn't read from a file, pass any arbitrary string.

The mode parameter is 'exec' if the source contains a sequence of statements, 'eval' if there is a single expression, or 'single' if it contains a single interactive statement.

The compiled code object is then stored in a file (named a.pyc here) using the dump() function. Note that a .pyc file produced by the interpreter also carries a header before the marshalled code, so this file cannot be imported as a module.

    import marshal
    script = """
    a=10
    b=20
    print('addition=', a+b)
    """
    code = compile(script, "script", "exec")
    f = open("a.pyc", "wb")
    marshal.dump(code, f)
    f.close()

To deserialize the object from the file, use the load() function. Since it returns a code object, it can be run using exec(), another built-in function.
    import marshal
    f = open("a.pyc", "rb")
    data = marshal.load(f)
    f.close()
    exec(data)
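The marshal module also offers dumps() and loads(), which work on bytes objects in memory. The sketch below, with illustrative variable names and a hypothetical Student class, round-trips a dictionary of built-in types and a code object, and shows that an instance of a user-defined class is rejected.

```python
import marshal

# Built-in types such as dict, list, str, int and float are supported
data = {"name": "Lata", "marks": [45, 56, 60], "ratio": 0.5}
blob = marshal.dumps(data)
restored = marshal.loads(blob)

# Code objects can be round-tripped and then executed
code = compile("result = 10 + 20", "<string>", "exec")
namespace = {}
exec(marshal.loads(marshal.dumps(code)), namespace)

# Instances of user-defined classes are not supported by marshal
class Student:
    pass

try:
    marshal.dumps(Student())
    unsupported_error = None
except ValueError as exc:
    unsupported_error = exc

print(restored, namespace["result"], type(unsupported_error).__name__)
```

For user-defined classes and for data that must survive a Python upgrade, pickle or json is usually the safer choice.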

Python Data Persistence – File API

Python uses the built-in input() and print() functions to perform standard input/output operations. The input() function reads a line of text from the standard input stream device, i.e. the keyboard.

The print() function, on the other hand, sends data to the standard output stream device, i.e. the display monitor. A Python program interacts with these IO devices through the standard stream objects stdin and stdout defined in the sys module.

The input() function behaves like a wrapper around the readline() method of the sys.stdin object. All keystrokes from the input stream are received until the 'Enter' key is pressed.

    >>> import sys
    >>> x=sys.stdin.readline()
    Welcome to TutorialsPoint
    >>> x
    'Welcome to TutorialsPoint\n'

Note that the readline() function leaves a trailing '\n' character. There is also a read() method, which reads data from the standard input stream until it is terminated by the Ctrl+D character.

    >>> x=sys.stdin.read()
    Hello
    Welcome to TutorialsPoint
    >>> x
    'Hello\nWelcome to TutorialsPoint\n'

Similarly, print() is a convenience function built on the write() method of the stdout object.

    >>> x='Welcome to TutorialsPoint\n'
    >>> sys.stdout.write(x)
    Welcome to TutorialsPoint
    26

Like the predefined stream objects stdin and stdout, a Python program can read data from and send data to a disk file or a network socket. They are also streams. Any object that has a read() method is an input stream. Any object that has a write() method is an output stream. Communication with the stream is established by obtaining a reference to the stream object with the built-in open() function.

open() function

This built-in function uses the following arguments:

    f=open(name, mode, buffering)

The name parameter is the name of the disk file (a string or bytes path). The mode parameter is an optional short string specifying the type of operation to be performed (read, write, append etc.). The buffering parameter is 0 to switch buffering off (allowed only in binary mode), 1 to select line buffering (text mode only) or -1 to use the system default.
The file opening modes are listed in the table below. The default mode is 'r'.

    Sr.No  Mode  Description
    1      r     Open for reading (default)
    2      w     Open for writing, truncating the file first
    3      x     Create a new file and open it for writing
    4      a     Open for writing, appending to the end of the file if it exists
    5      b     Binary mode
    6      t     Text mode (default)
    7      +     Open a disk file for updating (reading and writing)

In order to save data to a file, it must be opened in 'w' mode.

    f=open('test.txt','w')

This file object acts as an output stream and has access to the write() method. The write() method sends a string to this object, and the string is stored in the file underlying it.

    string='Hello TutorialsPoint\n'
    f.write(string)

It is important to close the stream, to ensure that any data remaining in the buffer is completely transferred to the file.

    f.close()

Try opening 'test.txt' in any text editor (such as Notepad) to confirm successful creation of the file.

To read the contents of 'test.txt' programmatically, it must be opened in 'r' mode.

    f=open('test.txt','r')

This object behaves as an input stream. Python can fetch data from the stream using the read() method.

    string=f.read()
    print (string)

The contents of the file are displayed on the Python console. The file object also supports the readline() method, which reads a single line, up to and including the newline character, and returns an empty string at end of file.

However, if the same file is opened in 'w' mode to store additional text in it, the earlier contents are erased. Whenever a file is opened in 'w' mode, it is treated as if it were a new file. To add data to an existing file, use 'a' for append mode.

    f=open('test.txt','a')
    f.write('Python Tutorials\n')

The file now has the earlier as well as the newly added string. The file object also supports the writelines() method to write each string in a list object to the file.
    f=open('test.txt','a')
    lines=['Java Tutorials\n', 'DBMS tutorials\n', 'Mobile development tutorials\n']
    f.writelines(lines)
    f.close()

Example

The readlines() method returns a list of strings, each representing a line in the file. It is also possible to read the file line by line until the end of file is reached.

    f=open('test.txt','r')
    while True:
       line=f.readline()
       if line=='' : break
       print (line, end='')
    f.close()

Output

    Hello TutorialsPoint
    Python Tutorials
    Java Tutorials
    DBMS tutorials
    Mobile development tutorials

Binary mode

By default, read/write operations on a file object are performed on text string data. If we want to handle files of other types such as media (mp3), executables (exe), pictures (jpg) etc., we need to add 'b' to the read/write mode.

The following statements write a bytes object to a file.

    f=open('test.bin', 'wb')
    data=b"Hello World"
    f.write(data)
    f.close()

A text string can also be converted to bytes using the encode() method.

    data="Hello World".encode('utf-8')

We need to use 'rb' mode to read a binary file. The value returned by the read() method is decoded before printing.

    f=open('test.bin', 'rb')
    data=f.read()
    print (data.decode(encoding='utf-8'))
    f.close()

In order to write integer data to a binary file, the integer object should be converted to bytes by the to_bytes() method.

    n=25
    f=open('test.bin', 'wb')
    data=n.to_bytes(8,'big')
    f.write(data)
    f.close()

To read back from the binary file, convert the output of the read() method to an integer with the from_bytes() method.

    f=open('test.bin', 'rb')
    data=f.read()
    n=int.from_bytes(data, 'big')
    print (n)
    f.close()

For floating point data, we need to use the struct module from Python's standard library.

    import struct
    x=23.50
    data=struct.pack('f',x)
    f=open('test.bin', 'wb')
    f.write(data)
    f.close()

Unpack the bytes returned by the read() method to retrieve the float data from the binary file. Note that unpack() always returns a tuple, so this prints (23.5,).

    f=open('test.bin', 'rb')
    data=f.read()
    x=struct.unpack('f', data)
    print (x)
    f.close()
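The explicit close() calls used in this chapter can be avoided with the with statement, which closes the file automatically even if an error occurs. The following is a minimal sketch; the temporary directory, the file names and the record layout ("<id", a 4-byte integer followed by an 8-byte double) are assumptions chosen for illustration.

```python
import os
import struct
import tempfile

workdir = tempfile.mkdtemp()  # scratch directory so no existing file is touched

# Text mode: the file is closed automatically when the with block ends
text_path = os.path.join(workdir, "test.txt")
with open(text_path, "w") as f:
    f.write("Hello TutorialsPoint\n")
    f.writelines(["Python Tutorials\n", "Java Tutorials\n"])

# A file object is iterable, yielding one line at a time
with open(text_path, "r") as f:
    lines = [line.rstrip("\n") for line in f]

# Binary mode: pack an int and a float into one fixed-size record.
# "<" selects little-endian byte order with standard sizes and no padding.
bin_path = os.path.join(workdir, "test.bin")
record = struct.pack("<id", 25, 23.5)
with open(bin_path, "wb") as f:
    f.write(record)

with open(bin_path, "rb") as f:
    n, x = struct.unpack("<id", f.read())

print(lines)
print(n, x)
```

Using one struct format string for the whole record keeps the integer and the float together, instead of writing them with two different techniques.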

Python Data Persistence – Shelve Module

The shelve module in Python's standard library provides a simple yet effective object persistence mechanism. The shelf object defined in this module is a dictionary-like object that is persistently stored in a disk file. This creates a file similar to a dbm database on UNIX-like systems.

The shelf dictionary has certain restrictions. Only the string data type can be used as a key in this special dictionary object, whereas any picklable Python object can be used as a value.

The shelve module defines three classes as follows:

    Sr.No  Class            Description
    1      Shelf            This is the base class for shelf implementations. It is initialized with a dict-like object.
    2      BsdDbShelf       This is a subclass of the Shelf class. The dict object passed to its constructor must support first(), next(), previous(), last() and set_location() methods.
    3      DbfilenameShelf  This is also a subclass of Shelf, but it accepts a filename as the parameter to its constructor rather than a dict object.

The open() function defined in the shelve module returns a DbfilenameShelf object.

    open(filename, flag='c', protocol=None, writeback=False)

The filename parameter is the name given to the database created. The default value for the flag parameter is 'c', which opens the database for read/write access and creates it if it doesn't exist. Other flags are 'w' (read/write access to an existing database), 'r' (read only) and 'n' (always create a new, empty database with read/write access).

The serialization itself is governed by the pickle protocol; the default is None, which selects pickle's default protocol. The last parameter, writeback, is False by default. If it is set to True, all accessed entries are cached in memory and written back on sync() and close(). This can consume a lot of memory and make closing the shelf slow.

The following code creates a database and stores dictionary entries in it.

    import shelve
    s=shelve.open("test")
    s['name']="Ajay"
    s['age']=23
    s['marks']=75
    s.close()

Depending on the underlying dbm implementation, this creates one or more files in the current directory (for example test.dir and test.dat) that store the key-value data. The Shelf object has the following methods available:

    Sr.No.  Method    Description
    1       close()   Synchronise and close the persistent dict object.
    2       sync()    Write back all entries in the cache if the shelf was opened with writeback set to True.
    3       get()     Returns the value associated with a key.
    4       items()   View of (key, value) pairs.
    5       keys()    View of shelf keys.
    6       pop()     Remove the specified key and return the corresponding value.
    7       update()  Update the shelf from another dict/iterable.
    8       values()  View of shelf values.

To access the value of a particular key in the shelf:

    s=shelve.open('test')
    print (s['age']) #this will print 23
    s['age']=25
    print (s.get('age')) #this will print 25
    s.pop('marks') #this will remove corresponding k-v pair

As in a built-in dictionary object, the items(), keys() and values() methods return view objects. Since the 'marks' key has just been removed, it no longer appears.

    print (list(s.items()))
    [('name', 'Ajay'), ('age', 25)]

    print (list(s.keys()))
    ['name', 'age']

    print (list(s.values()))
    ['Ajay', 25]

To merge the items of another dictionary with the shelf, use the update() method.

    d={'salary':10000, 'designation':'manager'}
    s.update(d)
    print (list(s.items()))

    [('name', 'Ajay'), ('age', 25), ('salary', 10000), ('designation', 'manager')]
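The effect of the writeback parameter is easiest to see with a mutable value. The sketch below, which assumes a shelf stored in a temporary directory and uses illustrative variable names, appends to a list stored in the shelf, first without and then with writeback=True.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "marks")  # scratch location

# Default (writeback=False): s["marks"] returns a fresh copy each time,
# so appending to that copy does not change what is stored on disk
with shelve.open(path) as s:
    s["marks"] = [75]
    s["marks"].append(80)
    without_writeback = s["marks"]

# writeback=True: accessed entries are cached and written back on close()
with shelve.open(path, writeback=True) as s:
    s["marks"].append(80)

with shelve.open(path) as s:
    with_writeback = s["marks"]

print(without_writeback)
print(with_writeback)
```

Without writeback, the usual alternative is to read the value, modify it and assign it back to the key explicitly.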

Python Data Persistence – dbm Package

The dbm package presents a dictionary-like interface to DBM-style databases. DBM stands for DataBase Manager. It is used by UNIX (and UNIX-like) operating systems. The original dbm library is a simple database engine written by Ken Thompson. These databases store both keys and values as bytes objects; strings are encoded automatically.

The database stores data by use of a single key (a primary key) in fixed-size buckets and uses hashing techniques to enable fast retrieval of the data by key.

The dbm package contains the following modules:

- dbm.gnu is an interface to the DBM library version as implemented by the GNU project.
- dbm.ndbm provides an interface to the UNIX ndbm implementation.
- dbm.dumb is used as a fallback option in the event that other dbm implementations are not found. It requires no external dependencies but is slower than the others.

The session below creates a database and then checks which implementation was used. On a system where only dbm.dumb is available, whichdb() reports 'dbm.dumb'.

    >>> import dbm
    >>> db=dbm.open('mydbm.db','n')
    >>> db['name']='Raj Deshmane'
    >>> db['address']='Kirtinagar Pune'
    >>> db['PIN']='431101'
    >>> db.close()
    >>> dbm.whichdb('mydbm.db')
    'dbm.dumb'

The open() function accepts these flags:

    Sr.No.  Value  Meaning
    1       'r'    Open existing database for reading only (default)
    2       'w'    Open existing database for reading and writing
    3       'c'    Open database for reading and writing, creating it if it doesn't exist
    4       'n'    Always create a new, empty database, open for reading and writing

The dbm object is a dictionary-like object, just like the shelf object. Keys and values can be stored, retrieved and deleted, and the in operator, the keys() method, get() and setdefault() are available. Other dictionary methods are not guaranteed by every implementation. The following code opens 'mydbm.db' with the 'r' flag and iterates over the collection of key-value pairs.

    >>> db=dbm.open('mydbm.db','r')
    >>> for k in db.keys():
    ...    print (k, ':', db[k])
    ...
    b'name' : b'Raj Deshmane'
    b'address' : b'Kirtinagar Pune'
    b'PIN' : b'431101'
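Because a dbm database stores only bytes, other data types have to be converted before storing and after reading. The following is a minimal sketch, assuming a database created in a temporary directory; the key names follow the example above.

```python
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mydbm")  # scratch location

# 'c' creates the database if it does not exist yet
with dbm.open(path, "c") as db:
    db["name"] = "Raj Deshmane"   # str is encoded to bytes automatically
    db["PIN"] = str(431101)       # an int must be converted to str or bytes first

with dbm.open(path, "r") as db:
    name = db["name"]                          # always returned as bytes
    pin = int(db["PIN"].decode())              # convert back to int explicitly
    address = db.get("address", b"unknown")    # default for a missing key
    keys = sorted(db.keys())

print(name, pin, address, keys)
```

When values are richer than plain strings, the shelve module, which pickles values on top of dbm, is usually more convenient.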

Python Data Persistence – CSV Module

CSV stands for comma separated values. This file format is commonly used for exporting/importing data to/from spreadsheets and data tables in databases. The csv module was incorporated in Python's standard library as a result of PEP 305. It presents classes and methods to perform read/write operations on CSV files as per the recommendations of PEP 305.

CSV is a preferred export data format of Microsoft's Excel spreadsheet software. However, the csv module can handle data represented by other dialects also.

The CSV API consists of the following writer and reader classes:

writer()

This function in the csv module returns a writer object that converts data into a delimited string and stores it in a file object. The function needs a file object with write permission as a parameter. Every row written to the file ends with a newline sequence. To prevent additional blank lines between rows, the file is opened with the newline parameter set to ''.

The writer class has the following methods:

writerow()
This method writes the items in an iterable (list, tuple or string), separating them by a comma character.

writerows()
This method takes a list of iterables as a parameter and writes each one as a comma separated line of items in the file.

Example

The following example shows the use of the writer() function. First a file is opened in 'w' mode. This file is used to obtain a writer object. Each tuple in the list of tuples is then written to the file using the writerow() method.

    import csv
    persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
    csvfile=open('persons.csv','w', newline='')
    obj=csv.writer(csvfile)
    for person in persons:
       obj.writerow(person)
    csvfile.close()

Output

This will create a 'persons.csv' file in the current directory. It will contain the following data.

    Lata,22,45
    Anil,21,56
    John,20,60

Instead of iterating over the list to write each row individually, we can use the writerows() method.
    csvfile=open('persons.csv','w', newline='')
    persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
    obj=csv.writer(csvfile)
    obj.writerows(persons)
    csvfile.close()

reader()

This function returns a reader object, which is an iterator over the lines in the csv file. Using a regular for loop, all lines in the file are displayed in the following example:

Example

    csvfile=open('persons.csv','r', newline='')
    obj=csv.reader(csvfile)
    for row in obj:
       print (row)

Output

    ['Lata', '22', '45']
    ['Anil', '21', '56']
    ['John', '20', '60']

The reader object is an iterator. Hence, it supports the next() function, which can also be used to display all lines in the csv file instead of a for loop.

    csvfile=open('persons.csv','r', newline='')
    obj=csv.reader(csvfile)
    while True:
       try:
          row=next(obj)
          print (row)
       except StopIteration:
          break

As mentioned earlier, the csv module uses Excel as its default dialect. The csv module also defines a Dialect class. A dialect is a set of conventions used to implement the CSV format. The list of available dialects can be obtained with the list_dialects() function.

    >>> csv.list_dialects()
    ['excel', 'excel-tab', 'unix']

In addition to iterables, the csv module can export a dictionary object to a CSV file and read it back to populate a Python dictionary object. For this purpose, the module defines the following classes:

DictWriter()

This function returns a DictWriter object. It is similar to the writer object, but the rows are mapped to dictionary objects. The function needs a file object with write permission and a list of the keys used in the dictionary as the fieldnames parameter. This list is used to write the first line in the file as a header.

writeheader()

This method writes the list of keys in the dictionary as a comma separated line, as the first line in the file.

In the following example, a list of dictionary items is defined. Each item in the list is a dictionary. Using the writerows() method, they are written to the file in comma separated form.
    persons=[
       {'name':'Lata', 'age':22, 'marks':45},
       {'name':'Anil', 'age':21, 'marks':56},
       {'name':'John', 'age':20, 'marks':60}
    ]
    csvfile=open('persons.csv','w', newline='')
    fields=list(persons[0].keys())
    obj=csv.DictWriter(csvfile, fieldnames=fields)
    obj.writeheader()
    obj.writerows(persons)
    csvfile.close()

The persons.csv file shows the following contents:

    name,age,marks
    Lata,22,45
    Anil,21,56
    John,20,60

DictReader()

This function returns a DictReader object from the underlying CSV file. As in the case of the reader object, this one is also an iterator, with which the contents of the file are retrieved.

    csvfile=open('persons.csv','r', newline='')
    obj=csv.DictReader(csvfile)

The class provides a fieldnames attribute, returning the dictionary keys used as the header of the file.

    print (obj.fieldnames)
    ['name', 'age', 'marks']

Loop over the DictReader object to fetch the individual dictionary objects.

    for row in obj:
       print (row)

On Python 3.6 and 3.7 this results in the following output. From Python 3.8 onwards, each row is returned as a regular dict instead.

    OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
    OrderedDict([('name', 'Anil'), ('age', '21'), ('marks', '56')])
    OrderedDict([('name', 'John'), ('age', '20'), ('marks', '60')])

An OrderedDict object can be converted to a normal dictionary by passing it to dict(). OrderedDict itself has to be imported from the collections module only to construct one, as in this example:

    from collections import OrderedDict
    r=OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
    dict(r)
    {'name': 'Lata', 'age': '22', 'marks': '45'}
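As the outputs above show, every value read back from a CSV file is a string. The following sketch round-trips the persons list through an in-memory io.StringIO buffer (used here in place of a disk file, purely for illustration) and converts the numeric columns back to integers.

```python
import csv
import io

persons = [
    {"name": "Lata", "age": 22, "marks": 45},
    {"name": "Anil", "age": 21, "marks": 56},
    {"name": "John", "age": 20, "marks": 60},
]

# An in-memory text buffer stands in for a file opened with newline=''
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "marks"])
writer.writeheader()
writer.writerows(persons)

# Rewind and read the rows back; all values arrive as strings
buffer.seek(0)
restored = [
    {"name": row["name"], "age": int(row["age"]), "marks": int(row["marks"])}
    for row in csv.DictReader(buffer)
]

print(restored == persons)
```

The explicit int() conversions are needed because the CSV format carries no type information.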

Python Data Persistence – PyMongo module

MongoDB is a document-oriented NoSQL database. It is a cross-platform database distributed under the Server Side Public License. It stores data as JSON-like documents.

In order to provide the capability to store huge amounts of data, more than one physical server (called shards) are interconnected, so that horizontal scalability is achieved. A MongoDB database consists of documents.

A document is analogous to a row in a table of a relational database. However, it doesn't have a particular schema. A document is a collection of key-value pairs, similar to a dictionary. However, the number of k-v pairs in each document may vary. Just as a table in a relational database has a primary key, a document in a MongoDB database has a special key called "_id".

Before we see how a MongoDB database is used with Python, let us briefly understand how to install and start MongoDB. Community and commercial versions of MongoDB are available. The community version can be downloaded from www.mongodb.com/download-center/community.

Assuming that MongoDB is installed in c:\mongodb, the server can be invoked using the following command.

    c:\mongodb\bin>mongod

The MongoDB server listens on port number 27017 by default. Databases are stored in the data/db folder by default, although the location can be changed with the --dbpath option.

MongoDB has its own set of commands to be used in a MongoDB shell. To invoke the shell, use the mongo command (recent MongoDB releases ship the shell as mongosh instead).

    c:\mongodb\bin>mongo

A shell prompt similar to the MySQL or SQLite shell prompt appears, at which native NoSQL commands can be executed. However, we are interested in connecting to a MongoDB database from Python.

The PyMongo module has been developed by MongoDB Inc itself to provide a Python programming interface. Use the well-known pip utility to install PyMongo.

    pip3 install pymongo

Assuming that the MongoDB server is up and running (with the mongod command) and is listening at port 27017, we first need to declare a MongoClient object.
It controls all transactions between the Python session and the database.

    from pymongo import MongoClient
    client=MongoClient()

Called without arguments, MongoClient connects to localhost on the default port. The host and port can also be given explicitly.

    client = MongoClient('localhost', 27017)

A new database is created with the following command.

    db=client.newdb

A MongoDB database can have many collections, similar to tables in a relational database. A Collection object is created by the create_collection() function.

    db.create_collection('students')

Now, we can add one or more documents to the collection as follows. A collection is also created implicitly when the first document is inserted, so the script below simply refers to it by name.

    from pymongo import MongoClient
    client=MongoClient()
    db=client.newdb
    student=db['students']
    studentlist=[{'studentID':1,'Name':'Juhi','age':20, 'marks':100},
    {'studentID':2,'Name':'dilip','age':20, 'marks':110},
    {'studentID':3,'Name':'jeevan','age':24, 'marks':145}]
    student.insert_many(studentlist)
    client.close()

To retrieve the documents (similar to a SELECT query), we should use the find() method. It returns a cursor with the help of which all documents can be obtained.

    students=db['students']
    docs=students.find()
    for doc in docs:
       print (doc['Name'], doc['age'], doc['marks'] )

To find a particular document instead of all of them in a collection, we need to apply a filter to the find() method. The filter uses comparison operators. MongoDB has its own set of operators as below:

    Sr.No  MongoDB operator  Traditional operator
    1      $eq               equal to (==)
    2      $gt               greater than (>)
    3      $gte              greater than or equal to (>=)
    4      $in               if equal to any value in array
    5      $lt               less than (<)
    6      $lte              less than or equal to (<=)
    7      $ne               not equal to (!=)
    8      $nin              if not equal to any value in array

For example, we are interested in obtaining a list of students older than 21 years.
This uses the $gt operator in the filter passed to the find() method, as follows:

    students=db['students']
    docs=students.find({'age':{'$gt':21}})
    for doc in docs:
       print (doc.get('Name'), doc.get('age'), doc.get('marks'))

The PyMongo module provides update_one() and update_many() methods for modifying one document or several documents that satisfy a specific filter expression.

Let us update the marks attribute of the document in which the name is Juhi.

    from pymongo import MongoClient
    client=MongoClient()
    db=client.newdb
    doc=db.students.find_one({'Name': 'Juhi'})
    db['students'].update_one({'Name': 'Juhi'},{"$set":{'marks':150}})
    client.close()
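To get a feel for how a filter document is read, the following pure-Python sketch evaluates the comparison operators from the table against ordinary dictionaries. It is not part of PyMongo and needs no server: the matches() helper and the COMPARISONS table are hypothetical names introduced only for this illustration, and the sketch simplifies MongoDB's behaviour by treating a missing field as a non-match for every operator.

```python
import operator

# Hypothetical mapping from MongoDB comparison operators to Python equivalents
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}

def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False  # simplification: missing fields never match here
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # {'Name': 'Juhi'} means equality
        for op, operand in condition.items():
            if not COMPARISONS[op](document[field], operand):
                return False
    return True

students = [
    {"studentID": 1, "Name": "Juhi", "age": 20, "marks": 100},
    {"studentID": 2, "Name": "dilip", "age": 20, "marks": 110},
    {"studentID": 3, "Name": "jeevan", "age": 24, "marks": 145},
]

older = [s["Name"] for s in students if matches(s, {"age": {"$gt": 21}})]
mid_range = [s["Name"] for s in students
             if matches(s, {"marks": {"$gte": 100, "$lte": 110}})]
print(older, mid_range)
```

Several operators on the same field, as in the second query, are combined with a logical AND. With a running server, the same filter dictionaries would be passed to find() unchanged.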

Python Data Persistence – Home

In this tutorial, we will explore various built-in and third-party Python modules to store and retrieve data to/from various formats such as text, CSV, JSON and XML files, as well as relational and non-relational databases. This tutorial also introduces the ZODB database, which is a persistence API for Python objects. Microsoft Excel format is a very popular data file format. Here, we will learn how to handle .xlsx files through Python.

Audience

This tutorial is for all software programmers who have a keen interest in learning about data persistence with regard to Python.

Prerequisites

If you are a novice to Python, it is suggested that you go through the tutorials related to Python before proceeding with this one.

Python Data Persistence – Object Serialization

Python's built-in file object, returned by the built-in open() function, has one important shortcoming. When opened in 'w' mode, the write() method accepts only string objects.

That means that if you have data represented in any non-string form, whether an object of a built-in class (numbers, dictionaries, lists or tuples) or of a user-defined class, it cannot be written to the file directly. Before writing, you need to convert it to its string representation.

    numbers=[10,20,30,40]
    file=open('numbers.txt','w')
    file.write(str(numbers))
    file.close()

For a binary file, the argument to the write() method must be a bytes-like object. For example, the list of integers is converted to bytes by the bytearray() function and then written to the file.

    numbers=[10,20,30,40]
    data=bytearray(numbers)
    file=open('numbers.txt','wb')
    file.write(data)
    file.close()

To read the data back from the file in the respective data type, the reverse conversion needs to be done.

    file=open('numbers.txt','rb')
    data=file.read()
    print (list(data))
    file.close()

This type of manual conversion of an object to string or byte format (and vice versa) is cumbersome and tedious. It is possible to store the state of a Python object in the form of a byte stream directly to a file or memory stream, and to restore it to its original state. This process is called serialization and de-serialization.

Python's standard library contains various modules for the serialization and deserialization process.

    Sr.No.  Name     Description
    1       pickle   Python specific serialization library
    2       marshal  Library used internally for serialization
    3       shelve   Pythonic object persistence
    4       dbm      Library offering an interface to Unix databases
    5       csv      Library for storage and retrieval of Python data to CSV format
    6       json     Library for serialization to the universal JSON format
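As a first taste of two of these modules, the sketch below serializes the same object with pickle and with json and restores it again. The record dictionary is an illustrative example, not part of any API.

```python
import json
import pickle

record = {"name": "Lata", "marks": (45, 56)}

# pickle produces bytes and restores Python-specific types such as tuples
pickled = pickle.dumps(record)
unpickled = pickle.loads(pickled)

# json produces text; the tuple comes back as a list because JSON has arrays only
as_json = json.dumps(record)
from_json = json.loads(as_json)

print(unpickled)
print(as_json)
print(from_json)
```

No manual conversion to strings or bytes is needed in either case, although only pickle preserves the exact Python types.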