`qp` arose from a desire to take a comprehensive inventory of certain data science libraries, to solidify my understanding of their components. Doing so manually would be prohibitively time-consuming, but helpfully many of the major projects are documented with a tool called Sphinx, which I knew from previous experience produces (as a side effect) something called an "intersphinx inventory": an `objects.inv` file colocated with the documentation website.
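
To make the inventory idea concrete, here is a minimal sketch of reading one in Python. It is not `qp`'s actual implementation. It assumes the version 2 inventory layout (four plain-text header lines followed by a zlib-compressed body with one record per line), and it builds a tiny made-up inventory in memory rather than downloading a real one. The names `build_sample_inventory` and `parse_inventory`, the `demo` entries and the `example.org` URL are all hypothetical.

```python
import re
import zlib

# One inventory record: name, domain:role, priority, URI, display name.
ENTRY_RE = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def build_sample_inventory() -> bytes:
    """Build a tiny in-memory inventory (made-up entries) in the version 2 layout."""
    header = (
        "# Sphinx inventory version 2\n"
        "# Project: demo\n"
        "# Version: 1.0\n"
        "# The remainder of this file is compressed using zlib.\n"
    ).encode("utf-8")
    body = (
        "demo.Thing py:class 1 api/demo.html#$ -\n"
        "demo.Thing.run py:method 1 api/demo.html#$ -\n"
        "install std:label -1 install.html#install Installation guide\n"
    ).encode("utf-8")
    return header + zlib.compress(body)


def parse_inventory(raw: bytes, base_url: str) -> list[dict]:
    """Parse inventory bytes into a list of entries with absolute URLs."""
    # Four header lines, then everything else is the compressed payload.
    parts = raw.split(b"\n", 4)
    if len(parts) < 5 or not parts[0].startswith(b"# Sphinx inventory version 2"):
        raise ValueError("not a version 2 Sphinx inventory")
    payload = zlib.decompress(parts[4]).decode("utf-8")

    entries = []
    for line in payload.splitlines():
        match = ENTRY_RE.match(line)
        if match is None:
            continue  # skip anything that is not a well-formed record
        name, kind, priority, uri, dispname = match.groups()
        domain, _, role = kind.partition(":")
        if uri.endswith("$"):
            # A trailing "$" is shorthand for the entry's own name.
            uri = uri[:-1] + name
        entries.append(
            {
                "name": name,
                "domain": domain,
                "role": role,
                "priority": int(priority),
                "url": base_url.rstrip("/") + "/" + uri,
                # "-" means the display name is the same as the name.
                "dispname": name if dispname == "-" else dispname,
            }
        )
    return entries


entries = parse_inventory(build_sample_inventory(), "https://example.org/docs/")
methods = [entry for entry in entries if entry["role"] == "method"]
for entry in methods:
    print(entry["name"], entry["url"])
```

Filtering on the `role` field, as in the last few lines, is the same kind of selection that a role filter on the command line would perform.
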
For this project I made a very minimal package to get the functionality I wanted, with a nice easy CLI built on `defopt`, which let me iterate quickly (where I would usually use the more sophisticated `argparse`).
I soon pivoted from the original intention of targeting a single library (pandas) to targeting multiple, and then extended it to target any library from a given URL.
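
As a rough illustration of that last step, the sketch below turns either a known library name or an arbitrary documentation URL into the location of its inventory. It is an assumption about how such a lookup could work, not a description of `qp`'s internals: the `DOCS_URL_TEMPLATES` table and the `inventory_url` function are hypothetical names, and the only thing relied on is that Sphinx publishes `objects.inv` at the root of the built documentation.

```python
from urllib.parse import urljoin

# Hypothetical shortcut table: library name -> documentation root template.
DOCS_URL_TEMPLATES = {
    "torch": "https://pytorch.org/docs/{version}/",
}


def inventory_url(target: str, version: str = "stable") -> str:
    """Return the objects.inv URL for a known library name or a docs URL."""
    if target.startswith(("http://", "https://")):
        base = target  # an explicit documentation URL was given
    else:
        base = DOCS_URL_TEMPLATES[target].format(version=version)
    if not base.endswith("/"):
        base += "/"  # without the slash, urljoin would drop the last path segment
    return urljoin(base, "objects.inv")


print(inventory_url("torch"))
print(inventory_url("https://pandas.pydata.org/docs"))
```
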
From the README:
- To get a list of all the entities in PyTorch (stable version) and their URLs (piped through `wc -l` here to count them), run:
  `qp torch -v stable -q | wc -l`
  ⇣ 3366
- To pull out just the `torch.Tensor` class methods, run:
  `qp torch -v stable --role method --names torch.Tensor -q | wc -l`
  ⇣ 514
- This has many uses, for example to create a list of Markdown link definitions, pipe it as:
  `echo "$(qp torch -v stable -r method -n torch.Tensor -q)" | sed -e 's/ /]: /g' -e 's/^torch\.Tensor\./[/g'`
⇣ ```md... ```
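
For readers less comfortable with `sed`, the same transformation can be sketched in Python. This assumes, based only on the `sed` expressions above, that each output line is a name and a URL separated by a single space. The sample line is illustrative rather than captured output, and `to_markdown_link_defs` is a hypothetical helper name.

```python
def to_markdown_link_defs(lines, prefix="torch.Tensor."):
    """Turn 'name url' lines into Markdown link definitions like '[abs]: url'."""
    link_defs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, url = line.split(" ", 1)
        # Drop the common prefix so the label is just the method name.
        label = name[len(prefix):] if name.startswith(prefix) else name
        link_defs.append(f"[{label}]: {url}")
    return link_defs


# Illustrative input, not real qp output.
sample = [
    "torch.Tensor.abs https://pytorch.org/docs/stable/generated/torch.Tensor.abs.html#torch.Tensor.abs",
]
for link_def in to_markdown_link_defs(sample):
    print(link_def)
```

One difference from the `sed` version: a line that does not start with the prefix still gets an opening bracket here, whereas the `sed` expression would leave it without one.
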
The first usage was in creating a lookup page for all methods on PyTorch's `Tensor`, which greatly solidified my grasp of this important class and continues to serve as a reference.
With a few chained bash commands, I was quickly able to set up the skeleton for this mammoth reference, get a sense of the relative scale of each part, and minimise the amount of manual work involved in creating it.