Friday, April 20, 2018

except BaseException or except Exception

I've recently become the 'Python expert' at a small company where a bunch of people were writing Python code, but nobody had really studied the language. Our code had a lot of cases of this:

try:
    <probably not foolish thing>
except:
    <deal with exceptions>

I've been dutifully correcting it to, and teaching the others to correct it to, this:

try:
    <probably not foolish thing>
except Exception as err:
    <deal with err>

But today I was reviewing a change where a colleague had applied an automated linter, and one of those instances had been replaced by this:

try:
    <probably not foolish thing>
except BaseException as err:
    <deal with err>

I had a feeling this was wrong. But I didn't want to just go with a feeling, so I tried to google which one I should use in which circumstances. I failed, so I asked Twitter.

Essentially, Exception is a subclass of BaseException. So are several things that you probably really do want to kill your program, and those are subclasses of BaseException but not of Exception (the one most noted in the Twitter replies was KeyboardInterrupt, which is what Ctrl+C raises). So except BaseException does what a bare except did, which is why the linter suggested it. But except Exception is almost certainly what you meant.
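To make the difference concrete, here is a minimal sketch. The helper caught_by is a hypothetical name invented for this illustration, not anything from the standard library; it simply reports whether a given except clause would catch a given exception.

```python
def caught_by(handler_type, exc):
    """Return True if `except handler_type` would catch `exc`."""
    try:
        raise exc
    except handler_type:
        return True
    except BaseException:
        # Anything that slipped past the handler above ends up here.
        return False


# Ordinary errors are caught by `except Exception`.
print(caught_by(Exception, ValueError("bad value")))   # True

# Ctrl+C and sys.exit() are not, so they can still stop the program.
print(caught_by(Exception, KeyboardInterrupt()))       # False
print(caught_by(Exception, SystemExit(1)))             # False

# `except BaseException` (like a bare `except:`) swallows them too.
print(caught_by(BaseException, KeyboardInterrupt()))   # True
```

If a handler really does need to see everything (say, to log before shutting down), catching BaseException and then re-raising with a bare raise is one way to avoid swallowing the interrupt.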


From the Python docs:

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StopAsyncIteration
      +-- ArithmeticError
      |    +-- FloatingPointError
      |    +-- OverflowError
      |    +-- ZeroDivisionError
      +-- AssertionError
      +-- AttributeError
      +-- BufferError
      +-- EOFError
      +-- ImportError
      |    +-- ModuleNotFoundError
      +-- LookupError
      |    +-- IndexError
      |    +-- KeyError
      +-- MemoryError
      +-- NameError
      |    +-- UnboundLocalError
      +-- OSError
      |    +-- BlockingIOError
      |    +-- ChildProcessError
      |    +-- ConnectionError
      |    |    +-- BrokenPipeError
      |    |    +-- ConnectionAbortedError
      |    |    +-- ConnectionRefusedError
      |    |    +-- ConnectionResetError
      |    +-- FileExistsError
      |    +-- FileNotFoundError
      |    +-- InterruptedError
      |    +-- IsADirectoryError
      |    +-- NotADirectoryError
      |    +-- PermissionError
      |    +-- ProcessLookupError
      |    +-- TimeoutError
      +-- ReferenceError
      +-- RuntimeError
      |    +-- NotImplementedError
      |    +-- RecursionError
      +-- SyntaxError
      |    +-- IndentationError
      |         +-- TabError
      +-- SystemError
      +-- TypeError
      +-- ValueError
      |    +-- UnicodeError
      |         +-- UnicodeDecodeError
      |         +-- UnicodeEncodeError
      |         +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
           +-- ResourceWarning

Saturday, January 13, 2018

When json.dumps(json.loads(x)) != x

I hit a minor bug in some code where it seemed like .readlines() was stripping newlines and .writelines() wasn't putting them back in. I started adding them back manually, but it irked me.

Following up on a Twitter conversation about it, I saw that it wasn't actually readlines and writelines that were the problem. When I read the data, I was doing a json.loads() to turn the data from a string into a dictionary, and then later using json.dumps() to turn that (now modified) dictionary back into a string. See the problem yet?


[Image: code demonstrating that loading and then dumping a JSON string with newlines removes them]

json.loads turns the string, however messy, into a nice, neat Python object. But json.dumps, with its default settings, writes that object out in a compact, single-line JSON form. So any extra whitespace between the JSON tokens, including those newlines I expected, gets dropped in the process.
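The round trip can be reproduced with a small sketch like the following. The sample string raw is made up for illustration; the output shown assumes the default json.dumps settings.

```python
import json

# A JSON document with indentation and a trailing newline (made-up sample data).
raw = '{\n    "name": "example",\n    "count": 1\n}\n'

round_tripped = json.dumps(json.loads(raw))

print(repr(round_tripped))    # '{"name": "example", "count": 1}'
print(round_tripped == raw)   # False: the layout whitespace is gone

# The data itself survives the round trip unchanged.
print(json.loads(round_tripped) == json.loads(raw))   # True

# If each record needs to end with a newline, it has to be added explicitly.
line = json.dumps(json.loads(raw)) + "\n"
print(line.endswith("\n"))    # True
```

Only whitespace between tokens is lost this way. A newline inside a string value is part of the data and comes back out as an escaped \n.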

I'm currently still of the opinion that readlines should strip out those newlines, and writelines should put them in. If you're counting the newline character as the way you know to go on to a new thing, then how can it also be part of that first thing? Shouldn't it behave more like .split('\n')? But I'm open to convincing on this.
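For comparison, here is a short sketch of how these methods behave today, using io.StringIO as a stand-in for a real file (an assumption made only to keep the example self-contained).

```python
import io

text = "first\nsecond\nthird\n"

# readlines() keeps the newline at the end of each line.
lines = io.StringIO(text).readlines()
print(lines)                  # ['first\n', 'second\n', 'third\n']

# splitlines() drops the newlines; split('\n') also leaves a trailing empty string.
print(text.splitlines())      # ['first', 'second', 'third']
print(text.split("\n"))       # ['first', 'second', 'third', '']

# writelines() writes each item as-is, so readlines() output round-trips exactly...
out = io.StringIO()
out.writelines(lines)
print(out.getvalue() == text)   # True

# ...while newline-free items get run together.
joined = io.StringIO()
joined.writelines(text.splitlines())
print(joined.getvalue())      # firstsecondthird
```

One argument for the current behaviour is visible here: because readlines keeps the newlines and writelines adds none, the pair round-trips a file exactly, including a last line that has no trailing newline.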