What is the difference between 'dump' and 'dumps' from the Pickle module in Python?

3

I've read the Python documentation, including the pickle module's own page, but I couldn't really absorb the content (it lacks examples). On the web I only found information on how to use dump + load, and nothing on dumps + loads.

    
asked by anonymous 15.05.2015 / 01:38

1 answer

3

dump and load each take an open file (or any other object with the file interface): dump saves the serialized content of the object to that file, and load reads it back from the file.
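As a minimal sketch of the file-based pair (Python 3 syntax; the temporary path and the variable names are only illustrative assumptions, not part of the original answer):

```python
import os
import pickle
import tempfile

data = {"b": ["c", "d", {1, 2, 3}, {"e": "f"}]}

# Hypothetical file location, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.pickle")

# dump() writes the serialized object into an open file (binary mode).
with open(path, "wb") as f:
    pickle.dump(data, f)

# load() reads from an open file and rebuilds the object.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == data)  # True
print(restored is data)  # False
```

The restored object is an equal copy, not the same object.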

dumps has no file parameter; it returns the serialized object as a byte string. loads takes a byte string as a parameter and returns the reconstructed object. They are meant for cases where you are not going to write the pickle result to a file right away, for example when serializing objects to send them over the network or to another process.

The Python standard library itself uses dumps and loads internally, in modules such as multiprocessing, to pass objects transparently to other processes.
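A rough illustration of that idea, assuming Python 3: the bytes returned by dumps are pushed through a multiprocessing pipe and rebuilt with loads on the other end. Both ends live in the same process here only to keep the sketch short; normally one end would be handed to a child process. The variable names are made up for the example.

```python
import multiprocessing
import pickle

payload = {"b": ["c", "d", {1, 2, 3}]}

# Pipe() returns two connected endpoints.
sender, receiver = multiprocessing.Pipe()

# Serialize by hand and ship the raw bytes...
sender.send_bytes(pickle.dumps(payload))
# ...then rebuild the object from the bytes on the receiving end.
received = pickle.loads(receiver.recv_bytes())

# send()/recv() accept the object directly and pickle it for you.
sender.send(payload)
received_again = receiver.recv()

sender.close()
receiver.close()

print(received == payload, received_again == payload)
```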

>>> a = {"b": ["c", "d", {1,2,3}, ({"e": "f"})]}
>>> print a
{'b': ['c', 'd', set([1, 2, 3]), {'e': 'f'}]}
>>> import pickle
>>> b = pickle.dumps(a)
>>> b
"(dp0\nS'b'\np1\n(lp2\nS'c'\np3\naS'd'\np4\nac__builtin__\nset\np5\n((lp6\nI1\naI2\naI3\natp7\nRp8\na(dp9\nS'e'\np10\nS'f'\np11\nsas."
>>> c = pickle.loads(b)
>>> c == a
True
>>> c is a
False

Other serialization modules mimic the pickle interface and offer the same four functions (dump, dumps, load and loads); this is the case for the json and marshal modules. json produces a serialized object conforming to the ECMA-404 specification, which is syntactically valid JavaScript, almost syntactically valid Python, and interchangeable with many languages. However, it can only serialize a subset of the native Python data types (unicode strings, integers and floats, Booleans, None, lists and dictionaries; other sequences, such as tuples, are converted to lists). marshal, on the other hand, can serialize practically all native Python data types (lists, dictionaries, sets, complex numbers, etc.) but will fail with objects of classes defined in pure Python, even ones from the standard library such as OrderedDict, namedtuple and several others.
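A small comparison sketch of those limits, assuming Python 3 and CPython's standard json, marshal and pickle modules (the sample values and variable names are arbitrary):

```python
import json
import marshal
import pickle
from collections import OrderedDict

# json: tuples are written as arrays, so they come back as lists.
text = json.dumps({"point": (1, 2), "ok": True, "nothing": None})
json_roundtrip = json.loads(text)

# json refuses types outside its subset, such as sets.
try:
    json.dumps({1, 2, 3})
    json_set_error = None
except TypeError as exc:
    json_set_error = exc

# marshal copes with sets and complex numbers...
marshal_roundtrip = marshal.loads(marshal.dumps([{1, 2, 3}, 1 + 2j]))

# ...but rejects instances of classes such as OrderedDict.
try:
    marshal.dumps(OrderedDict(a=1))
    marshal_error = None
except ValueError as exc:
    marshal_error = exc

# pickle handles the same OrderedDict without complaint.
pickle_roundtrip = pickle.loads(pickle.dumps(OrderedDict(a=1)))

print(json_roundtrip)
print(type(json_set_error).__name__, type(marshal_error).__name__)
```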

pickle, in turn, serializes almost anything you throw at it, including instances of classes defined in your own code, and functions (with some caveats: the deserializing side has to "know" the names of the functions and classes of the serialized objects, because they are stored by name rather than by content).
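To make that caveat concrete, here is a short sketch, assuming Python 3. It uses a function and a class from the standard library (json.dumps and fractions.Fraction, chosen arbitrarily) and the text-based protocol 0 so that the stored names are readable in the output:

```python
import fractions
import json
import pickle

# Pickling a function stores only a reference: module name + function name.
func_blob = pickle.dumps(json.dumps, protocol=0)
print(func_blob)

# Unpickling looks the name up again and returns the very same function.
func_again = pickle.loads(func_blob)
print(func_again is json.dumps)  # True

# Instances carry their state, but the class is still referenced by name.
value = fractions.Fraction(1, 3)
value_blob = pickle.dumps(value, protocol=0)
value_again = pickle.loads(value_blob)
print(b"Fraction" in value_blob, value_again == value)
```

If the receiving side cannot import a module and name that match, loads has nothing to rebuild the object from and fails.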

Finally, if you really want to serialize crazy things, including functions together with their body (the code object), there is a module on PyPI that works on top of pickle and supports this: dill (and guess what: it has load, loads, dump and dumps).

(Beware with "load" and "dump": the files they use must be opened in binary mode, never in text mode, especially in Python 3.x.)
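A quick way to see why, assuming Python 3: in this sketch io.BytesIO and io.StringIO stand in for files opened in binary ("wb"/"rb") and text ("w") mode respectively, so that nothing is written to disk.

```python
import io
import pickle

obj = {"answer": 42}

# Binary "file": dump writes bytes, load reads them back.
binary_file = io.BytesIO()
pickle.dump(obj, binary_file)
binary_file.seek(0)
restored = pickle.load(binary_file)

# Text "file": pickle hands bytes to write(), which a text stream rejects.
text_file = io.StringIO()
try:
    pickle.dump(obj, text_file)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

print(restored == obj)
print(type(text_mode_error).__name__)
```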

    
15.05.2015 / 06:11