!!! abstract "Just as attrs and dataclasses use type hints to simplify data type definition, scinexus uses them to simplify writing best-practice scientific algorithms."
scinexus (pronounced ‘sigh-nexus’) is a Python framework
for rapid development of data processing applications. It enables
interoperability between apps through defined data types, which allows
ecosystems of apps to be developed for a scientific domain (for examples, see cogent3 and piqtree).
Many scientific problems require repeating calculations across many files or database records. Such tasks suit data-level parallelism on multi-core CPUs, but writing robust, maintainable code for them is often tedious and quickly becomes complex.
With scinexus apps, you can use a functional programming
style when developing your application. Combined with
scinexus app composition, this greatly simplifies your
programming logic, making it easier to understand and thus easier to
explain. And as we know:
!!! quote
    If the implementation is easy to explain, it may be a good idea.

    -- Tim Peters, "Zen of Python"
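As a rough illustration of the functional, composable style, the sketch below chains small single-purpose functions into one callable using plain Python. It is not the scinexus composition mechanism; the `compose` helper and the three step functions are hypothetical names introduced only for this example.

```python
from functools import reduce
from typing import Callable


def compose(*steps: Callable) -> Callable:
    """Chain single-argument functions, applied left to right."""

    def pipeline(value):
        # Feed the output of each step into the next one.
        return reduce(lambda acc, step: step(acc), steps, value)

    return pipeline


def strip_gaps(seq: str) -> str:
    """Remove gap characters from a sequence."""
    return seq.replace("-", "")


def to_upper(seq: str) -> str:
    """Normalise a sequence to upper case."""
    return seq.upper()


def length(seq: str) -> int:
    """Return the number of characters in a sequence."""
    return len(seq)


# The composed callable reads as a description of the analysis.
ungapped_length = compose(strip_gaps, to_upper, length)
```

Because each step takes one value and returns one value with no side effects, the steps can be tested in isolation and rearranged without touching the surrounding logic.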
scinexus also provides generally useful utilities for
developers of data analysis applications. Utilities for file IO,
parallel execution, and progress tracking are usable independently of
the app framework.
To install scinexus, see Installing from PyPI.

scinexus origin story

The app infrastructure code was originally developed within cogent3, where it accumulated over seven
years of development, testing, and real-world use in computational
genomics before being extracted into scinexus. The design
is mature and has underpinned analyses in published studies.
We acknowledge the many members of the cogent3
community who contributed to the code that now lives in scinexus, including @GavinHuttley, @rmcar17, @Nick-Foto, @KatherineCaley, @fredjaya, and @khiron.
Failures are automatically recorded as
NotCompleted records, which are propagated and stored in data stores. These
records capture the salient details that help you identify the cause of the
failure.
tqdm is the default progress display because of its
robustness in notebooks, but you can choose rich instead.
The default parallel execution backend is Python’s standard library
multiprocessing module. If you’re using Jupyter notebooks,
however, loky is recommended. It is an optional install and is
easy to configure.
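The failure handling described above, where a failure becomes a record that propagates instead of an exception that halts the run, can be sketched in plain Python. This is an assumption-laden illustration of the general pattern, not the scinexus implementation: `FailureRecord` and `guarded` are hypothetical names, and storage in a data store is omitted.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


@dataclass(frozen=True)
class FailureRecord:
    """Hypothetical stand-in for a failure record (not the scinexus class)."""

    origin: str  # name of the step that failed
    message: str  # exception type and text
    source: Any  # the input that triggered the failure

    def __bool__(self) -> bool:
        # Falsy, so callers can filter failures with a simple truth test.
        return False


def guarded(step: Callable) -> Callable:
    """Wrap a step so failures are recorded and passed along, not raised."""

    @wraps(step)
    def wrapper(value):
        if isinstance(value, FailureRecord):
            return value  # an earlier step failed; propagate unchanged
        try:
            return step(value)
        except Exception as err:
            return FailureRecord(step.__name__, f"{type(err).__name__}: {err}", value)

    return wrapper


@guarded
def parse_number(text: str) -> float:
    return float(text)


@guarded
def reciprocal(x: float) -> float:
    return 1 / x


# One bad input does not stop the others from being processed.
results = [reciprocal(parse_number(text)) for text in ["4", "0", "abc"]]
failures = [r for r in results if isinstance(r, FailureRecord)]
```

Here the first input succeeds, the second fails in `reciprocal`, and the third fails in `parse_number` and skips `reciprocal` entirely. Each record keeps the step name and the offending input, which is the kind of detail needed to trace a failure after a long batch run.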