Nearly all scientific computing and data analysis applications need data input and output, for example, to load datasets or to persistently store results. Getting data in and out of programs is consequently a key step in the computational workflow. There are many standardized formats for storing structured and unstructured data. The benefit of using a standardized format is that you can rely on existing libraries for reading and writing data, saving yourself both time and effort. In the course of working with scientific and technical computing, you are likely to encounter a variety of data formats through interaction with colleagues and peers, or when acquiring data from sources such as equipment and databases. As a computational practitioner, it is important to be able to handle data efficiently and seamlessly, regardless of which format it comes in. This is why an entire chapter is devoted to the topic.
Although RFC 4180, http://tools.ietf.org/html/rfc4180, is sometimes taken as an unofficial specification, in practice there exist many varieties and dialects of CSV.
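To illustrate what a CSV dialect means in practice, the following is a minimal sketch using the csv module from the Python standard library. The two sample strings and the helper read_rows are hypothetical and exist only for this illustration. csv.Sniffer guesses the delimiter heuristically, so it should be treated as a convenience rather than a guarantee.

```python
import csv
import io

# Two hypothetical samples holding the same records in different dialects.
comma_text = "name,value\nalpha,1\nbeta,2\n"
semicolon_text = "name;value\nalpha;1\nbeta;2\n"


def read_rows(text):
    """Guess the delimiter from the text, then parse every row."""
    # Restrict the candidates to the two delimiters used in this sketch.
    dialect = csv.Sniffer().sniff(text, delimiters=",;")
    return list(csv.reader(io.StringIO(text), dialect))


rows_comma = read_rows(comma_text)
rows_semicolon = read_rows(semicolon_text)

# Both dialects should yield the same list of rows once parsed.
print(rows_comma == rows_semicolon)
```

When the dialect of a file is known in advance, passing the delimiter explicitly to csv.reader is usually more robust than sniffing.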
This is also known as out-of-core computing. For another recent project that also provides out-of-core computing capabilities in Python, see the dask library (http://dask.pydata.org/en/latest).
Note that the Python module provided by the PyTables library is named tables. Therefore, tables.open_file refers to the open_file function in the tables module provided by the PyTables library.
For more information about JSON, see http://json.org.
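In Python 2, an alternative to the pickle module is the cPickle module, a more efficient reimplementation that is also available in the Python standard library. In Python 3, the pickle module uses the optimized C implementation automatically when it is available, and there is no separate cPickle module. See also the dill library at http://trac.mystic.cacr.caltech.edu/project/pathos/wiki/dill.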
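As a minimal sketch of the kind of serialization the pickle module provides, the following round trip converts a Python object to bytes and back. The results dictionary is hypothetical sample data, not taken from the chapter. Unpickling can execute arbitrary code, so pickle.loads should only be applied to data from a trusted source.

```python
import pickle

# Hypothetical results object, used only for illustration.
results = {"label": "run-1", "values": [1.0, 2.5, 4.0]}

# Serialize to a bytes object using the newest protocol available.
payload = pickle.dumps(results, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize; only do this with data from a trusted source.
restored = pickle.loads(payload)

print(restored == results)
```

To persist the object on disk instead, pickle.dump and pickle.load accept a file opened in binary mode.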
- Data Input and Output
- Chapter 18