avalongenerator 1.1.0
Avalon is an extendable, scalable, high-performance streaming data
generator that can be used to simulate real-time input for various
systems.
Installation
To install Avalon with all of its dependencies, you can use pip:
pip install avalon-generator[all]
Avalon supports many command-line arguments, so you will probably want
to enable its argcomplete support for tab completion of arguments. Run
the following command for a single session, or add it to your ~/.bashrc
to keep it for future sessions:
eval "$(avalon --completion-script=bash)"
If you install Avalon on Ubuntu using the PPA, command-line
auto-completion is enabled automatically.
Installation on Ubuntu
There is a PPA for Avalon, which you may prefer if you are using
Ubuntu. You can install Avalon from the PPA with the following commands:
sudo add-apt-repository ppa:mrazavi/avalon
sudo apt update
sudo apt install avalon
Usage
In its simplest form, you can name a model as the command-line
argument of avalon and it will produce data for the specified model
on the standard output. The following command uses the --textlog
shortcut to generate logs similar to those of the Snort IDS:
avalon snort --textlog
Multiple models can be used at the same time. You can list the
available models with the following command:
avalon --list-models
The default output format (without --textlog) is json-lines,
which outputs one JSON document per line. Other formats such as csv are
also supported. To see the supported formats, use the --help
argument and check the options for --output-format, or
enable auto-completion and press the <tab> key to see the available options.
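Because json-lines carries one JSON document per line, a consumer can decode the stream line by line. The following is a minimal sketch of such a consumer, not part of Avalon itself; the function name `read_json_lines` and the sample field names are hypothetical, since the real keys depend on the chosen model.

```python
import io
import json


def read_json_lines(stream):
    """Yield one dictionary per non-empty line of a json-lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample standing in for generator output.
sample = io.StringIO('{"id": 1, "msg": "first"}\n{"id": 2, "msg": "second"}\n')
records = list(read_json_lines(sample))
```

Passing `sys.stdin` instead of the in-memory sample would let the same function read data piped from the avalon command.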
Besides --output-format, the output media can be specified
via --output-media. Many output media, such as file, http,
grpc, kafka, and direct insertion into SQL databases, are
supported out of the box.
The number and the rate of the outputs can be controlled via the
--number and --rate arguments.
For high rates, you might want to utilize your multiple CPU cores. To do
so, just prefix your model name with the number of instances you want to
run at the same time, e.g. 10snort to run 10 snort instances
(with 10 Python processes that could utilize up to 10 CPU cores).
When using multiple models at the same time, you can also provide a
ratio for the output of each model, e.g. 10snort1000 5asa20. That
means 10 instances of the snort model and 5 instances of the asa model,
with a ratio of 1000 outputs from the snort producers to 20 from the asa
producers.
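To make the notation concrete, here is a small sketch that splits such an argument into its instance count, model name, and ratio, and then derives each model's share of the output. It is an illustration only and not Avalon's own parser: the regular expression, the function names, and the default of one instance when the prefix is omitted are all assumptions.

```python
import re

# Hypothetical pattern: optional instance count, model name, optional ratio.
_SPEC = re.compile(r"^(\d*)([A-Za-z_]\w*?)(\d*)$")


def parse_model_spec(spec):
    """Split e.g. "10snort1000" into (10, "snort", 1000)."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized model spec: {spec!r}")
    instances, name, ratio = match.groups()
    # Assumption: a missing prefix means one instance; a missing ratio is None.
    return (int(instances) if instances else 1, name, int(ratio) if ratio else None)


def output_shares(specs):
    """Return each model's fraction of the total output, given explicit ratios."""
    parsed = [parse_model_spec(spec) for spec in specs]
    if any(ratio is None for _, _, ratio in parsed):
        raise ValueError("every spec needs an explicit ratio")
    total = sum(ratio for _, _, ratio in parsed)
    return {name: ratio / total for _, name, ratio in parsed}


shares = output_shares(["10snort1000", "5asa20"])
```

With the example above, snort accounts for 1000 of every 1020 outputs and asa for the remaining 20.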
Another important parameter for achieving high resource utilization is
the batch size, which can be increased with the --batch-size argument.
The --output-writers argument determines the number of simultaneous
writes to the output media. So if your sink is a file, an HTTP server,
or any other medium that supports concurrent writes, you can
provide --output-writers to tune the parallelism.
Here is an example that uses multiple processes to write to a CSV file
at 10000 items per second:
# You don't need to enter --output-media=file because
# Avalon will automatically infer it after you enter an
# argument such as --file-name
#
avalon 20snort 5asa \
--batch-size=1000 --rate=10000 --number=1000000 --output-writers=25 \
--output-format=headered-csv --file-name=test.csv
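As a rough sanity check of what these numbers imply, the arithmetic can be sketched as follows. This assumes that --rate is the aggregate number of items per second and that --number is the total item count across all instances; the helper name `plan` is hypothetical.

```python
import math


def plan(number, rate, batch_size):
    """Estimate run duration and batch counts for a generation run."""
    return {
        "duration_seconds": number / rate,
        "total_batches": math.ceil(number / batch_size),
        "batches_per_second": rate / batch_size,
    }


# Values taken from the command above.
estimate = plan(number=1_000_000, rate=10_000, batch_size=1000)
```

Under these assumptions the run lasts about 100 seconds and emits 1000 batches at roughly 10 batches per second.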
The Avalon command line supports many more options, which you can
explore via the --help argument or through auto-completion by pressing
the <tab> key in the command line.
Architecture
Avalon architecture consists of several abstractions that give it great
flexibility:
Model
Each model is responsible for generating a specific kind of data. For
example, a model might generate data similar to the logs of a specific
application or appliance, while another model might generate network
flows or packets.
Model output is usually an unbounded iterable of Python
dictionaries.
Mapping
Mappings transform the model data for a different purpose. For
example, one might want to use different key names in JSON or
different column names in a CSV file or SQL database. You can specify a
chain of multiple mappings to achieve your goal.
Format
Each format (or formatter) is responsible for converting a batch of
model data to a specific format, e.g. JSON or CSV.
Format output is usually a string or bytes array, although other
types could also be used according to the output media.
Media
Each media is responsible for transferring the batched formatted data
to a specific data sink. For example it could write data to a file or
send it to a remote server via network.
Generic Extension
Generics, currently in beta, are a brand new type of extension
that gives the user ultimate flexibility to modify input arguments or
execute arbitrary tasks based on them.
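The four core abstractions above (model, mapping, format, media) can be pictured as a small pipeline. The sketch below uses plain Python functions to show how the pieces fit together. It does not use Avalon's base classes or API, and every name and field in it is hypothetical.

```python
import io
import itertools
import json
import random


def toy_model():
    """Model: an unbounded iterator of dictionaries (hypothetical fields)."""
    for seq in itertools.count(1):
        yield {"seq": seq, "level": random.choice(["info", "warn"])}


def rename_level(item):
    """Mapping: rename the "level" key to "severity"."""
    item["severity"] = item.pop("level")
    return item


def json_lines_format(batch):
    """Format: turn a batch of dictionaries into one json-lines string."""
    return "".join(json.dumps(item) + "\n" for item in batch)


def stream_media(sink, payload):
    """Media: deliver the formatted batch to a sink (here, any text stream)."""
    sink.write(payload)


def run(sink, number, batch_size):
    """Generate `number` items and push them through the pipeline in batches."""
    items = map(rename_level, itertools.islice(toy_model(), number))
    while True:
        batch = list(itertools.islice(items, batch_size))
        if not batch:
            break
        stream_media(sink, json_lines_format(batch))


sink = io.StringIO()
run(sink, number=5, batch_size=2)
```

Batching happens between the mapping and the format step, which is why a format receives a list of items while a mapping sees one item at a time.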
Extension
Avalon supports third-party extensions. So, you can develop your own
models, formats, etc. to generate data for your specific use cases or
send them to a sink that Avalon does not support out of the box.
You can also publish your developed extensions publicly if you think
they could benefit other users.
More information is available at EXTENSIONS.org.
Mappings
Although developing and running an Avalon extension is as simple as
creating a specific directory structure and running the avalon command
with a specific PYTHONPATH environment variable, there is an even
simpler method that might come in handy when you want to use a
user-defined mapping.
A mapping can modify the model output dictionary before it is used by
the formatter. Avalon supports a couple of useful mappings out of the
box, but new mappings can also be defined in a simple Python script
whose file path is passed as a URL on the avalon command line.
For example, the following script, if put in a mymap.py file, can be
used as a mapping:
# Any valid name for the class is acceptable.
class MyMap:
    def map(self, item):
        # Item is the dictionary generated by the models

        # Rename "foo" key to "bar"
        item["bar"] = item.pop("foo", None)

        item["new"] = "a whole new key value"

        # Don't forget to return the item
        return item
NOTE: Unlike normal extension mappings, which have to inherit from a
specific base class, the mappings passed as file:// URLs to
avalon have no such obligation.
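Since such a mapping is an ordinary Python class, it can be exercised directly before being handed to Avalon. The snippet below repeats the class from mymap.py and applies it to a made-up item; the input keys are hypothetical, as real items carry model-specific keys.

```python
class MyMap:
    def map(self, item):
        # Rename "foo" to "bar"; a missing "foo" yields None for "bar".
        item["bar"] = item.pop("foo", None)
        item["new"] = "a whole new key value"
        return item


# Hypothetical model output used only for this check.
result = MyMap().map({"foo": 42, "other": "unchanged"})
print(result)
# → {'other': 'unchanged', 'bar': 42, 'new': 'a whole new key value'}
```

Note that the mapping mutates and returns the same dictionary it receives, so the caller's item is changed in place.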
Now, the mapping could be passed to Avalon with --map as a URL:
avalon --map=file:///path/to/mymap.py
Avalon also supports passing multiple --map arguments, and all the
provided mappings will be applied in the specified order. One
particularly useful use case is to define many simple mappings and
combine them to achieve the desired goal.
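The effect of ordering can be illustrated with two tiny mappings applied one after the other. This is only a sketch of the idea of chaining: the class names and the `apply_mappings` helper are hypothetical and do not reflect how Avalon composes mappings internally.

```python
from functools import reduce


class RenameFoo:
    def map(self, item):
        # Rename "foo" to "bar"; a missing "foo" yields None.
        item["bar"] = item.pop("foo", None)
        return item


class UppercaseKeys:
    def map(self, item):
        # Return a new dictionary with every key upper-cased.
        return {key.upper(): value for key, value in item.items()}


def apply_mappings(item, mappings):
    """Feed the item through each mapping in the given order."""
    return reduce(lambda current, mapping: mapping.map(current), mappings, item)


renamed_then_upper = apply_mappings({"foo": 1}, [RenameFoo(), UppercaseKeys()])
upper_then_renamed = apply_mappings({"foo": 1}, [UppercaseKeys(), RenameFoo()])
```

The first order produces a single "BAR" key, while the second leaves "FOO" in place and adds an empty "bar", because the rename no longer finds the lowercase key.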
Also, using curly braces, you can apply a mapping to only a specific
model when combining multiple models. Here is an example:
# mymap.py will be applied to the first snort, the internal jsoncolumn
# mapping will be applied to asa, and the last snort will be used
# without any mappings.
avalon "snort{file:///path/to/mymap.py} asa{jsoncolumn} snort"
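To spell out how this argument pairs models with mappings, the sketch below splits it into (model, mapping) tuples. It is an illustration of the notation only, not Avalon's parser; the pattern and function name are hypothetical, and it assumes at most one mapping per pair of braces.

```python
import re

# Hypothetical pattern: a model name, optionally followed by {mapping}.
_TOKEN = re.compile(r"(\w+)(?:\{([^}]*)\})?")


def split_model_mappings(spec):
    """Return a list of (model, mapping or None) for each whitespace-separated token."""
    pairs = []
    for token in spec.split():
        match = _TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


pairs = split_model_mappings("snort{file:///path/to/mymap.py} asa{jsoncolumn} snort")
```

A model without braces yields `None` as its mapping, matching the last snort in the example.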
Etymology
The name Avalon is based on the legendary island featured
in the Arthurian legend, and it has nothing to do with the proprietary
Spirent Avalanche traffic generator.
Authors
Mohammad Razavi
Mohammad Reza Moghaddas