bigtempo 0.38.9
Implementation: Python 2.7
Status: Alpha (contract may change)
Download: http://pypi.python.org/pypi/bigtempo/
Source: http://github.com/rhlobo/bigtempo/
Keywords: bigdata, time series, temporal processment, temporal analysis, data processment, data analysis, scalable, distributed, data exploration, python
This is a Python package created to help you build complex hierarchies of processing steps, each referred to as a datasource.
The package was originally conceived to handle temporal data and is typically used as a companion to pandas - dealing with time series and dataframes - but it is flexible and can easily be extended to support other data models.
It handles dependency resolution, provides a tagging system that enables querying operations over datasource sets, and much more.
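To give a rough idea of what querying datasource sets by tag can look like, here is a minimal sketch. It is a generic illustration, not bigtempo's API: the `TagIndex` class, its `register` and `select` methods, and the datasource names are all hypothetical.

```python
# Illustrative sketch only: TagIndex and its methods are hypothetical
# names chosen for this example, NOT part of bigtempo's API.


class TagIndex(object):
    """Maps datasource references to tags and selects references by tag."""

    def __init__(self):
        self._tags = {}  # reference -> set of tags

    def register(self, reference, tags):
        # Remember which tags describe the given datasource reference.
        self._tags[reference] = set(tags)

    def select(self, *required_tags):
        # Return, in sorted order, every reference carrying ALL required tags.
        required = set(required_tags)
        return sorted(
            reference
            for reference, tags in self._tags.items()
            if required <= tags
        )


index = TagIndex()
index.register('PRICES_RAW', ['raw', 'prices'])
index.register('PRICES_WEEKLY', ['derived', 'prices', 'weekly'])
index.register('VOLUME_RAW', ['raw', 'volume'])

price_sources = index.select('prices')      # every datasource tagged 'prices'
raw_prices = index.select('raw', 'prices')  # tagged both 'raw' and 'prices'
```

Selecting by several tags at once narrows the set, which is the kind of operation a tagging system makes cheap once datasources are labeled at registration time.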
There are other software packages that focus on lower level aspects of data processing, like pandas, numpy, sympy, theano.
This is not a framework to replace these. Instead, it aims to support many of these tools, helping you stitch many processing steps together.
It provides a decoupled programming model built with scalability at its core, and it takes care of much of the workflow management so that you can focus on the data itself.
Bigtempo aims to support a wide range of applications - including artificial intelligence systems - that work in a data pull fashion.
Its philosophy is to load things as lazily as possible: analyses are retrieved from the cache if available and computed otherwise.
A datasource serves data through processors that can be used by other datasources (or by you directly), and processors are designed to run in a distributed fashion if desired.
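The pull-based, cache-first behaviour described above can be sketched in a few lines. This is a generic illustration under stated assumptions, not bigtempo's implementation or API: `LazyRegistry`, `register`, `pull`, and the datasource names are hypothetical, and plain lists stand in for time series.

```python
# Illustrative sketch only: LazyRegistry and its methods are hypothetical
# names chosen for this example, NOT part of bigtempo's API.


class LazyRegistry(object):
    """Resolves datasources on demand and caches each computed result."""

    def __init__(self):
        self._builders = {}    # reference -> (dependency references, function)
        self._cache = {}       # (reference, start, end) -> computed result
        self.computations = 0  # number of times a function actually ran

    def register(self, reference, dependencies, function):
        self._builders[reference] = (tuple(dependencies), function)

    def pull(self, reference, start, end):
        key = (reference, start, end)
        if key in self._cache:
            # Cache hit: nothing is recomputed.
            return self._cache[key]
        dependencies, function = self._builders[reference]
        # Resolve dependencies first; each one is itself pulled lazily.
        inputs = [self.pull(dep, start, end) for dep in dependencies]
        result = function(start, end, *inputs)
        self.computations += 1
        self._cache[key] = result
        return result


registry = LazyRegistry()
registry.register('RAW', [], lambda start, end: list(range(start, end)))
registry.register('DOUBLED', ['RAW'],
                  lambda start, end, raw: [2 * value for value in raw])
registry.register('TOTAL', ['DOUBLED'],
                  lambda start, end, doubled: sum(doubled))

first = registry.pull('TOTAL', 0, 5)   # computes RAW, DOUBLED and TOTAL
second = registry.pull('TOTAL', 0, 5)  # served entirely from the cache
```

Nothing runs until a result is requested, and a second request for the same reference and period does not trigger any computation. Keying the cache on the requested period is a simplification; smarter temporal caching would reuse overlapping ranges.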
Keep in mind that the package, although performant, is in alpha stage and, as such, most of its caching and distributed processing capabilities are still in the oven.
Getting started
You can get started by reading an IPython notebook; for a better understanding of what can be done, take a look at the pandas introduction.
Example project
If you need more examples, or just feel like checking out how bigtempo can be used in a project, please refer to stockExperiments.
Installation
To install, simply:
$ pip install bigtempo
Or, if you absolutely must:
$ easy_install bigtempo
Dependencies
Both installation methods above should take care of dependencies automatically.
The pandas library is the only direct dependency the package needs at runtime. Visit its page to find out what it depends on. For best results, we recommend installing optional packages as well.
If you want to run the package tests, or enjoy its testing facilities, you’ll need:
mockito >= 0.5.1
In order to run the tests using the command contained in the bin directory, also install:
nose >= 1.3.0
coverage >= 3.6
pep8 >= 1.4.5
Installing from source
To install bigtempo from source, follow these steps:
Clone the git repository:
$ git clone https://github.com/rhlobo/bigtempo.git
Get into the project directory:
$ cd bigtempo
Install dependencies (if you are not using virtualenv, this may require superuser privileges):
$ pip install -r requirements.txt
Install it:
$ python setup.py install
Alternatively, you can use pip if you want all the dependencies pulled in automatically (the -e option installs the package in development mode):
$ pip install -e .
Next versions?
Distributed processing
Built-in process pools
Integration with celery
Integration with Apache ZooKeeper and ZeroMQ
Caching
Smart temporal data caching
Compatibility
Python 2.7+
Bug tracker
If you have any suggestions, bug reports or annoyances, please report them to our issue tracker.
Contribute
On the tracker, check for open issues or open a new one to start a discussion around an idea or bug.
Fork the repository on GitHub to start making your changes.
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and wait until it gets merged and published. Make sure to add yourself to AUTHORS.