
Pyodec

An open-source data file decoder

A Python library for simple decoding of a broad range of proprietary or non-structured data files, compiled by the scientific community, for the scientific community.

Save Time

Only write a decoder once, use decoders others have written, and use tools provided in the library to write decoders faster.


Decode Better

Minimize lazy shortcuts by writing less code and improving on others' work. Get all of the data, and get it right.


Simplify Science

With more and better decoders, researchers can skip the frustrating decoding process and get on with their analysis.


1. Find your decoder

Search the file types library using any information you have about your file, such as the instrument that created it, the company, or the dates.

See included decoders

2. Decode your file

If a decoder exists, all you have to do is:

>>> import pyodec
>>> pyodec.decode(filesrc, decoder="myfiledecoder")

or

>>> from pyodec.files.myfiledecoder import decoder as thedecoder
>>> thedecoder.decode(filesrc)

Or use one of the several other ways to call a decoder method on a file or string.

The techniques of decoding

Or, write a new file decoder

Early on, Pyodec will not have every decoder ready to go.

You can use the tools and base classes of Pyodec to write your own standalone decoder, which can then be pulled into the package easily once you have gotten it working.
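As a rough illustration of what a standalone decoder can look like, here is a minimal sketch. It does not use Pyodec's actual base classes: `LineDecoder`, `CsvTempDecoder`, and the comma-separated sample format are all hypothetical and defined only for this example. The idea is that a shared base class handles the source (text or file object) and a small subclass knows how to parse one line.

```python
import io


class LineDecoder:
    """Hypothetical base class (not Pyodec's API): handles the source, subclasses parse lines."""

    def decode_line(self, line):
        """Return a parsed record, or None to skip the line."""
        raise NotImplementedError

    def decode(self, src):
        # Accept either raw text or an already open file-like object.
        handle = io.StringIO(src) if isinstance(src, str) else src
        records = []
        for line in handle:
            record = self.decode_line(line.rstrip("\r\n"))
            if record is not None:
                records.append(record)
        return records


class CsvTempDecoder(LineDecoder):
    """Made-up format: 'YYYY-MM-DD HH:MM:SS,temperature', with '#' comment lines."""

    def decode_line(self, line):
        if not line or line.startswith("#"):
            return None
        stamp, temp = line.split(",")
        return {"time": stamp, "temperature": float(temp)}


sample = "# station log\n2011-01-19 18:00:04,-3.5\n2011-01-19 18:01:04,-3.4\n"
records = CsvTempDecoder().decode(sample)
```

Keeping the format-specific logic in one small method is what makes a decoder like this easy to share and to improve later.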

Learn about writing decoders

Get Pyodec

Latest Release | GitHub Project


How pyodec is already used

This may all seem somewhat trivial, so here is a real-world example of how this software is used.

Ceilometer backscatter

We wanted to read backscatter profiles from an instrument designed to report only the output of an algorithm applied to those profiles. The backscatter profiles were hidden in log files, with each profile encoded like this:

^A
-2011-01-19 18:00:04
CL017121^B
10 03670 ///// ///// 00000000C000
  8 012  0 ///  0 ///  0 ///  0 ///
00100 10 0770 100 +27 100 01 0025 L0016HN15 221
0228b00f1600cdf0125101541012ab011bd00ff500f6e00e9000dc100cfd00c440....fff4affeccfff750003e00059fff42ffed8ffeefffd4dffcd5ff
^Ca394^D

However, we could read this data as a text file, and once we had written a proper decoder for the long line of data, it was simple to get a set of backscatter profiles from any of these log files.
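To give a feel for what decoding the long line involves, here is a minimal sketch, not Pyodec's own implementation. It assumes each value is a fixed-width field of 5 hexadecimal characters holding a 20-bit two's complement integer, which matches the grouping visible in the sample above. The function name `decode_hex_profile` is made up for this example, and the test string is built from fields visible at the start and end of the sample line.

```python
def decode_hex_profile(hexline, width=5):
    """Split a hex string into fixed-width fields and decode each as a signed integer."""
    bits = 4 * width  # each hex character carries 4 bits
    values = []
    # Stop before any trailing partial field, such as one cut off by truncation.
    for i in range(0, len(hexline) - width + 1, width):
        raw = int(hexline[i:i + width], 16)
        # Two's complement: a set top bit means the value is negative.
        if raw >= 1 << (bits - 1):
            raw -= 1 << bits
        values.append(raw)
    return values


# Two fields from the start of the sample line and two from near its end.
profile = decode_hex_profile("0228b00f16fff4affecc")
# profile == [8843, 3862, -182, -308]
```

Any scaling from these raw integers to physical backscatter units depends on the instrument settings and is left out here.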

This is how Pyodec came to be. Now that Vaisala CL-31 data files are supported, all you have to do is call:

pyodec.decode('ceilometerfile.dat', decoder='vaisalaCl31msg2')


Pyodec makes reading files like this easier. You need one researcher or developer who can properly develop a decoder, but from that point on, your researchers can use these kinds of data files without having to know how to catch an EOM character or compute a two's complement.
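For readers curious about the message-framing part, here is a minimal sketch of how the boundaries of each message could be found. It is an illustration only, not the code Pyodec uses. It assumes the `^A`, `^B`, `^C`, and `^D` markers in the sample are caret notation for the ASCII control characters SOH, STX, ETX, and EOT, and that the text between ETX and EOT is a checksum. The name `split_messages` is hypothetical.

```python
import re

# ASCII control characters, shown as ^A, ^B, ^C, ^D in caret notation.
SOH, STX, ETX, EOT = "\x01", "\x02", "\x03", "\x04"

# One framed message: SOH header STX body ETX checksum EOT.
MESSAGE = re.compile(
    SOH + r"(.*?)" + STX + r"(.*?)" + ETX + r"(.*?)" + EOT,
    re.DOTALL,  # let '.' match newlines, since messages span several lines
)


def split_messages(text):
    """Yield (header, body lines, checksum) for each framed message in text."""
    for header, body, checksum in MESSAGE.findall(text):
        yield header.strip(), body.strip().splitlines(), checksum.strip()


log = (
    "\x01\n-2011-01-19 18:00:04\nCL017121\x02\n"
    "10 03670 ///// ///// 00000000C000\n"
    "\x03a394\x04\n"
)
messages = list(split_messages(log))
```

This sketch does not verify the checksum; a real decoder would likely want to do so before trusting the body.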

Let's get started!