!!! abstract ""

    How to use `scitrack` logging in apps, control logging in `apply_to`, and access citation records from composed pipelines.
## scitrack for reproducibility

We reproduce here one of the examples from `scitrack`.
???- example "Using scitrack in a click app"

    ```python linenums="1"
    import click
    from scitrack import CachingLogger

    LOGGER = CachingLogger()


    @click.command()
    @click.option("-i", "--infile", type=click.Path(exists=True))
    @click.option("-t", "--test", is_flag=True, help="Run test.")
    def main(infile, test):
        # capture the local variables, at this point just provided arguments
        LOGGER.log_args()  # (1)!
        LOGGER.log_versions("numpy")  # (2)!
        LOGGER.input_file(infile)  # (3)!

        LOGGER.log_file_path = "some_path.log"  # (4)!


    if __name__ == "__main__":
        main()
    ```

    1. :man_raising_hand: A single statement and you have captured all the input arguments and their values, including defaults!
    2. This captures the version numbers of the packages our application depends on.
    3. This logs the path to `infile` and its md5sum.
    4. Until you assign the path where you want the file written, this content has been cached.
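To make the caching behaviour in annotation 4 concrete, here is a minimal standard-library sketch of the idea. It is not `scitrack`'s implementation: the `DeferredLog` class, its methods, and the `md5sum` helper are hypothetical names invented for illustration only.

```python
import hashlib
import tempfile
from pathlib import Path


class DeferredLog:
    """Hypothetical stand-in: holds messages in memory until a path is set."""

    def __init__(self):
        self._cache = []  # messages waiting for a destination
        self._path = None  # log file path, unknown at start

    def log(self, label, value):
        line = f"{label} : {value}"
        if self._path is None:
            self._cache.append(line)  # nowhere to write yet, so hold it
        else:
            with self._path.open("a") as out:
                out.write(line + "\n")

    def set_path(self, path):
        """Assign the log file and flush everything cached so far."""
        self._path = path
        with path.open("a") as out:
            out.writelines(f"{line}\n" for line in self._cache)
        self._cache.clear()


def md5sum(path):
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    infile = Path(tmp) / "input.txt"
    infile.write_bytes(b"ACGT\n")

    log = DeferredLog()
    log.log("input_file_path", str(infile))
    log.log("input_file_path md5sum", md5sum(infile))
    cached_before = len(log._cache)  # both messages are still in memory

    log_path = Path(tmp) / "run.log"
    log.set_path(log_path)  # only now is anything written to disk
    log_lines = log_path.read_text().splitlines()
```

The point of deferring the write is that an app can start recording arguments and inputs before it has worked out where its output, and therefore its log file, will live.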
## `apply_to`

By default, `apply_to` creates a `CachingLogger` that records the composable function, package versions, output paths, MD5 checksums of every result, and total elapsed time. The log is then written into the output data store. This is the recommended setting for production analyses because it gives you a complete, self-contained record of what ran and what it produced.
```python { notest }
result = process.apply_to(dstore)  # logger=True by default
```
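As a rough illustration of the kind of record described above (function name, a checksum per result, total elapsed time), the following standard-library sketch builds a comparable dictionary. The `run_with_record` helper and the record keys are assumptions made for this example; they are not the log format that `apply_to` writes.

```python
import hashlib
import time


def run_with_record(func, inputs):
    """Apply func to each named input and return (results, record)."""
    start = time.monotonic()
    results = {}
    checksums = {}
    for name, data in inputs.items():
        out = func(data)
        results[name] = out
        # checksum of the result, so a later run can be compared with this one
        checksums[name] = hashlib.md5(out.encode("utf8")).hexdigest()
    record = {
        "function": func.__name__,
        "md5sums": checksums,
        "elapsed_seconds": time.monotonic() - start,
    }
    return results, record


def upper(text):
    """Toy processing step standing in for a composed app."""
    return text.upper()


results, record = run_with_record(upper, {"a.fa": "acgt", "b.fa": "ttga"})
```

Storing a checksum next to each output is what lets you confirm later that a result file is unchanged since the run that produced it.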
You can also pass your own `CachingLogger` instance if you want to configure it beforehand or reuse one across multiple calls.
```python { notest }
from scitrack import CachingLogger

LOGGER = CachingLogger()
LOGGER.log_args()
result = process.apply_to(dstore, logger=LOGGER)
```

### Disabling logging
Set `logger=False` to skip logging entirely.
```python { notest }
result = process.apply_to(dstore, logger=False)
```

This is useful when:

- [...] `CachingLogger` that wraps several `apply_to` calls.

## Citations

Correctly attributing the authors of algorithms and software is a requirement of good scientific practice. `scinexus` makes this easy by letting app authors declare citations that are automatically tracked through composed pipelines.

Use the `cite` parameter of `define_app` (or the base classes) to attach a citation. The `citeable` library provides several classes for this purpose.

???- example "Adding a citation to your app"

    ```python { linenums="1" notest }
    from citeable import Software
    from scinexus import define_app
    from cogent3.app.typing import AlignedSeqsType
    from cogent3 import get_app

    my_cite = Software(
        author=["Doe, J", "Smith, A"],
        title="My Sequence Filter",
        year=2025,
        url="https://example.com/my-filter",
        version="0.1.0",
    )


    @define_app(cite=my_cite)  # (1)!
    def strict_filter(val: AlignedSeqsType) -> AlignedSeqsType:
        """Remove sequences shorter than the alignment."""
        return val.omit_bad_seqs()


    app = strict_filter()
    loader = get_app("load_aligned", moltype="dna", format_name="fasta")
    pipeline = loader + strict_filter()
    print(pipeline.citations)  # (2)!
    print(f"\n{pipeline.bib}")  # (3)!
    # (Software( author=['Doe, J', 'Smith, A'], title='My Sequence Filter',
    # year=2025, version='0.1.0', url='https://example.com/my-filter', [...]
    ```

    1. Use the `cite` parameter of `define_app` to attach a citation.
    2. The `.citations` property returns citations as a tuple. When apps are composed into a pipeline, `.citations` collects unique citations from all apps in the chain.
    3. The `.bib` property gives the BibTeX string.
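The "collects unique citations from all apps in the chain" behaviour can be sketched with the standard library alone. This is an illustration of the idea, not how `scinexus` implements it: `Cite` and `Step` are hypothetical classes, and first-seen ordering is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cite:
    """Hypothetical citation record; frozen makes it hashable for deduplication."""

    author: tuple
    title: str
    year: int


class Step:
    """Hypothetical pipeline step carrying zero or more citations."""

    def __init__(self, name, citations=()):
        self.name = name
        self.citations = tuple(citations)

    def __add__(self, other):
        # dict.fromkeys drops duplicates while keeping first-seen order
        merged = tuple(dict.fromkeys(self.citations + other.citations))
        return Step(f"{self.name} + {other.name}", merged)


filter_cite = Cite(("Doe, J", "Smith, A"), "My Sequence Filter", 2025)
loader = Step("loader")  # no citation attached
strict = Step("strict_filter", [filter_cite])

# the same cited step appears twice, but its citation is reported once
pipeline = loader + strict + strict
```

Merging at composition time means the final pipeline object already knows everything it needs to cite, however many steps contributed.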
When a composed pipeline is run via `apply_to()`, citations are automatically saved in the output data store.

???- example "Citations in data stores"

    ```python { linenums="1" notest }
    from citeable import Software
    from scinexus import define_app, open_data_store
    from cogent3.app.typing import AlignedSeqsType
    from cogent3 import get_app

    my_cite = Software(
        author=["Doe, J"],
        title="My Sequence Filter",
        year=2025,
    )


    @define_app(cite=my_cite)
    def strict_filter(val: AlignedSeqsType) -> AlignedSeqsType:
        return val.omit_bad_seqs()


    in_dstore = open_data_store("data/raw.zip", suffix="fa", limit=5)
    out_dstore = open_data_store("cited_results", suffix="fa", mode="w")

    loader = get_app("load_aligned", moltype="dna", format_name="fasta")
    writer = get_app("write_seqs", data_store=out_dstore, format_name="fasta")
    process = loader + strict_filter() + writer
    result = process.apply_to(in_dstore)

    result.summary_citations  # (1)!
    result.write_bib("my_analysis.bib")  # (2)!
    ```

    1. Because we are using `cogent3`, the property returns a `cogent3` `Table` of all citations stored in the data store.
    2. You can export to a BibTeX file.
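For readers unfamiliar with BibTeX, the sketch below shows roughly what an exported entry looks like. The `to_bibtex` function, the citation key, and the choice of the `@misc` entry type are assumptions made for illustration; the exact entry type and fields produced by `citeable` and `write_bib` may differ.

```python
def to_bibtex(key, author, title, year, url=None):
    """Format one citation as a BibTeX @misc entry (illustrative only)."""
    fields = {
        "author": " and ".join(author),  # BibTeX separates authors with "and"
        "title": title,
        "year": str(year),
    }
    if url:
        fields["url"] = url
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


entry = to_bibtex(
    "doe2025filter",  # hypothetical citation key
    ["Doe, J", "Smith, A"],
    "My Sequence Filter",
    2025,
    url="https://example.com/my-filter",
)
print(entry)
```

A file of such entries can be passed directly to a reference manager or a LaTeX bibliography.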
!!! note

    `ReadOnlyDataStoreZipped` supports reading stored citations but not writing them.