Lesser known parts of Python standard library

rbanffy | 278 points

I find defaultdict, OrderedDict, namedtuple among other data structures/classes in the collections module to be incredibly useful.

Another module that's packaged with the stdlib that's immensely useful is itertools. I especially find takewhile, cycle, and chain to be incredibly useful building blocks for list-related functions. I highly recommend a quick read.

EDIT: functools is also great! Fantastic module for higher-order functions on callable objects.

https://docs.python.org/3/library/itertools.html

judicious | 12 days ago

> OrderedDict - dictionary that maintains order of key-value pairs (e.g. when HTTP header value matters for dealing with certain security mechanisms).

Word to the wise... as of Python 3.7, the regular dictionary data structure guarantees order. Declaring an OrderedDict can still be worthwhile for readability (to let code reviewers/maintainers know that order is important) but I don't know of any other reason to use it anymore.

chadash | 11 days ago

If you liked this blog post, I can’t recommend PyMOTW[0] highly enough. It’s my goto for a concise introduction whenever I need to pick up a new Python stdlib module.

[0]: https://pymotw.com/3/

brianyu8 | 12 days ago

Throwing frozensets out, too. If regular sets aren't obscure enough, frozensets might be your thing. It looks like a set, it acts like a set, but its... hashable (for indexing) and (immutable.) Why use this? For algorithms that rely on combinations (not permutations), frozensets can be very useful. E.g. NOT this -- (0, 1) (1, 0) (both distinct using tuples) vs frozenset([0, 1]) ([1, 0] or [0, 1] have the same identity / frozenset.) You can use this for indexing algorithms and things like that. Sometimes, sets are very convenient because they naturally 'normalise' entries into a fixed order. This can simply a lot of code.

Uptrenda | 12 days ago

Adding `array` [0] to the list. It's generally slower than a list, but massively more memory-efficient. You're limited to a heterogeneous type, of course, but they can be quite useful for some operations.

[0]: https://docs.python.org/3/library/array.html

sgarland | 11 days ago

The following are identical

    fractions.Fraction(numerator=1, denominator=3)
    fractions.Fraction(1) / 3
ChainMap is maybe better described/used as inheritance for dicts, where something like

    settings = ChainMap(instance_settings, region_settings, global_settings)
would give you one object to look in.
nick238 | 12 days ago

Add functools to the list. Espacially functools.wraps() and functools.partial().

The stdlib is full of goodies.

Now I always appreciated the battery included logic in python. But I noticed this week that LLM diminish that need. It's so easy to prompt for small utilities and saves you from using entire libraries for a few tools.

And the AI can create doc and tests for them as quickly.

So while I was really enthusiastic things like pairwise() were added to itertools, it's not as revolutionary as before.

BiteCode_dev | 11 days ago

MappingProxyType is another handy one. It wraps a regular dict/Mapping to create a read-only live view of the underlying dict. You can use it to expose a dict that can't be modified, but doesn't need copying.

https://docs.python.org/3/library/types.html#types.MappingPr...

h4l | 11 days ago

To run a localhost webserver on port 8000, serving the content of the current directory:

  python -m http.server
Pass -h for more options.
pjot | 11 days ago

For a funny and insightful tour of the Python "built in" functions, I highly recommend Dave Beazley's talk:

https://www.youtube.com/watch?v=j6VSAsKAj98

alexpotato | 11 days ago

Some modules are essential additions while others are handy so as not to have to manage dependencies.

Good example of the latter use case is the statistics module.

There is a price to pay though: its performance is 10x slower than numpy. So its mostly useful when the required calculation is not a bottleneck.

The benefit is you are good to go (batteries included) without any virtual environmemts, pip's etc.

openrisk | 12 days ago

I just discovered graphlib.TopologicalSorter the other day.

Nice! When you need it, you need it. It's nice not to have to implement it oneself.

dairiki | 11 days ago

> file_url = 'file://' + os.path.realpath('test.html')

You have to encode the file name!

  file_url = 'file://' + urllib.parse.quote(os.path.realpath('test.html'))
teddyh | 11 days ago

This is the reason I love python for small personal projects. I can get up and going in a heartbeat and the stdlib has so much that I'd need. If there was a Flask-style HTTP server and a more requests-like HTTP client in the stdlib, I'd be a content man. Maybe I need to suck it up, but I just find venvs and Python packaging in general annoying to deal with, especially for something small.

That said, Go has those things so it's crept in a little bit into my quick programming, but I'll always love python.

jjice | 11 days ago

> For people eager to join the AI/ML revolution it provides Naive Bayes classifier - an algorithm that can be considered a minimum viable example of machine learning.

I don't think this is true. It allows you to specify and calculate parameters for normal distributions, what allows you to jury rig a naive bayes classifier, what is shown as a doc example. This is not the same as providing a built in classifier.

Qem | 11 days ago

I was not aware of zipapp ... but it's interesting to see it exists as a method for enabling python to run 'zipped packages' ... since python can already do that by default with normal zipfiles, as long as the zipfile appears / is added to the python path (which is roughly analogous to how one can add .jar files to the classpath in java). E.g.:

  export PYTHONPATH="package.zip"
  python3 -m packagename
will work just fine.

(PS. I document this technique in one of my python-template projects: https://git.sr.ht/~tpapastylianou/python-self-contained-runn...)

I suppose, if the intent is to package something in a manner that attempts to make it newbie-proof, then requiring a PYTHONPATH incantation before the python part might be one step too far ... but then again, one could argue the same about people not quite knowing what to do with a .pyz file and getting stuck.

tpoacher | 11 days ago

I see a lot of mentions of itertools in this thread, which is indeed a great library, but I want to mention that itertools.groupby is one of the easiest to misuse functions I've seen. It's not necessarily intuitive that it groups contiguous records. Passing it an unsorted list won't break, but also might not return the results you're expecting.

rurp | 11 days ago

Probably silly, but I went five years of programming in python before I learned about the help function. Only learned about it when I had to take an intro to python class for school.

magicmicah85 | 11 days ago

I've significantly reduced my use of `namedtuple` since DataClasses were introduced, but I confess I never did much performance comparisons between the two.

I assume the `namedtuple` syntax is more pleasing for Functional favorable programmers, but this makes me wonder if the stdlib should choose one of them?

guhcampos | 11 days ago

I am surprised that they didn't mention pack/unpack. And a namedtuple can take the results of unpack which means you can easily parse binary files. Like

    header = Header._make(unpack(f.read(64), header_format))

    print(header)
nurettin | 10 days ago

ChainMap is one of my favourites. I like when I find a use for it. The obvious one is cascading options type thing (like cmdline options -> env -> defaults). I also found a use for it recently when changing the underlying storage layer of a class without breaking the API.

My other favourite parts of the stdlib are functools and itertools. They are both full of stuff that gives you superpowers. I always find it a shame when I see developers do an ad hoc reimplantation of something in functools/itertools.

globular-toast | 11 days ago

TIL decimal and fraction. Pretty cool.

djoldman | 11 days ago

Didn't know "dis". It looks nice!

Everyone these days is using ast [1] but the might be room for dis instead in some cases.

[1] https://docs.python.org/3/library/ast.html#module-ast

flakiness | 11 days ago

I did not know about that webbrowser module. This will definitely come in handy for sure!

joshdavham | 12 days ago

It has an HTTP server that can compete with Flask for simple use.

OutOfHere | 11 days ago

The contrast between links and the background colors is too low, making this very hard to read.

timdiggerm | 11 days ago

what do people use when they want shorthand for something like this:

  a['foo'] = 20
  a['bar'] = 9
where you want to be able to do:

  a.foo = 20
  a.bar = 9
m463 | 11 days ago

contextlib.ExitStack is a lesser known trick for limiting context manager nesting.

skinner927 | 11 days ago