@mwouts
Last active December 9, 2022 16:47
2018-06 Debug Jupyter notebooks with PyCharm

Have you ever had to debug a large cell in a Jupyter notebook? Below I share my experience on the subject. We'll review the classical methods for debugging notebooks, and then I'll show how to set breakpoints in PyCharm for code being executed in a Jupyter notebook, and benefit from the comfort of a real Python IDE for debugging.

Debug in the notebook

Before I describe what PyCharm can do, let's quickly review the Jupyter commands for debugging.

Catch exceptions

%pdb on is my favorite. It is a magic command that starts a debug shell on exceptions (deactivate this mode with %pdb off).
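
For reference, the idea behind post-mortem debugging can be approximated in plain Python: the debugger is opened on the traceback of the exception, at the frame where it was raised. The sketch below only locates that frame and its local variables. The names failing_frame and divide are hypothetical, chosen for illustration.

```python
import sys


def failing_frame(func, *args):
    """Call func(*args); if it raises, return (function name, locals) of the raising frame."""
    try:
        func(*args)
    except Exception:
        tb = sys.exc_info()[2]
        # Walk to the innermost traceback entry, where the exception was raised.
        while tb.tb_next is not None:
            tb = tb.tb_next
        # pdb.post_mortem(tb) would open an interactive debugger on this frame instead.
        return tb.tb_frame.f_code.co_name, dict(tb.tb_frame.f_locals)
    return None


def divide(a, b):
    return a / b


print(failing_frame(divide, 1, 0))
```

In a notebook you would not need any of this: %pdb on gives you the interactive shell directly.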

Breakpoint with set_trace

This is the Jupyter version of the classic Python import pdb; pdb.set_trace(). At the location of the desired breakpoint, insert

import IPython; IPython.core.debugger.set_trace()

Conditional breakpoints

T. Hoffmann proposes an elegant conditional breakpoint function:

from IPython.core.debugger import Pdb as CorePdb
import sys

def breakpoint(condition=True):
    """
    Set a breakpoint at the location the function is called if `condition == True`.
    """
    if condition:
        debugger = CorePdb()
        frame = sys._getframe()
        debugger.set_trace(frame.f_back)
        return debugger


def add(a, b):
    breakpoint(type(a) != type(b))
    return a + b

add('a', 2)
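
Note that since Python 3.7, breakpoint is also the name of a built-in function, so the definition above shadows it. A variant under a different name avoids the clash. This is a sketch and not T. Hoffmann's original: break_if and its debugger_factory parameter are hypothetical names, the latter added only so that the function can be exercised without an interactive session.

```python
import sys


def break_if(condition, debugger_factory=None):
    """Start a debugger in the caller's frame if `condition` is true."""
    if not condition:
        return None
    if debugger_factory is None:
        # Imported lazily so the module loads even where IPython is not installed.
        from IPython.core.debugger import Pdb
        debugger_factory = Pdb
    debugger = debugger_factory()
    # f_back is the frame of the function that called break_if.
    debugger.set_trace(sys._getframe().f_back)
    return debugger


def add(a, b):
    break_if(type(a) != type(b))
    return a + b
```

Calling add(1, 2) returns normally, while add('a', 2) would stop in the debugger just before the failing addition.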

What you get with Jupyter debug commands

Below the cell under debug, you get an input line where you can enter pdb commands.

The most useful commands are:

  • q(uit) to exit the debugger and return to jupyter
  • u(p) to go one frame up (in case you used %pdb on)
  • n(ext), s(tep), r(eturn), c(ontinue) to execute next line, step into function, execute until end of current function, or continue execution
  • p(rint) for printing the content of a variable.

A sample debug session looks like

Don't forget to quit the debugger, otherwise cells won't execute any more. If you forget, interrupt the kernel; in some cases you will recover a functional notebook.

How to debug a notebook with PyCharm

Now we head towards a more comfortable solution! It does require some configuration work the first time you use it, but from the second time on the operation is very easy, believe me.

Configure PyCharm with the same python interpreter as Jupyter

The following assumes you have PyCharm installed.

Identify the python that is used in your jupyter notebook with

import sys
sys.executable
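
A slightly more detailed variant also reports the Python version and the environment prefix, which can help when several environments have similar paths. The function name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter():
    """Return the facts needed to pick the matching interpreter in the IDE."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "prefix": sys.prefix,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```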

Now create a PyCharm project at the same location as your Jupyter notebook. Configure the project to use the same interpreter as Jupyter:

Move function under debug to a python script

PyCharm only allows you to set breakpoints in Python modules and packages.

If you want to do step-by-step execution of a python package, please simply open the desired module in PyCharm, and skip to the next paragraph.

If you want to debug a function you wrote yourself in the notebook, please move the function to a .py file. In this example I create a script.py file with content

def add(x, y):
    return x+y

We now load the autoreload extension with

%load_ext autoreload
%autoreload 2

The extension makes sure Jupyter always uses the latest version of the script (useful when you fix the bug).
Finally we import the desired function with

from script import add
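
Conceptually, picking up an edited module amounts to reloading it. The sketch below illustrates the idea with the standard library's importlib.reload on a temporary module. It is an illustration of the concept, not the autoreload extension's actual implementation, and the module name reload_demo_script is made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reload_demo():
    """Import a buggy module, fix its source on disk, reload it, and compare results."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "reload_demo_script.py"
        # First version contains a bug: it subtracts instead of adding.
        module_path.write_text("def add(x, y):\n    return x - y\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("reload_demo_script")
            before = module.add(2, 3)
            # Fix the bug on disk, as you would in the IDE.
            module_path.write_text("def add(x, y):\n    return x + y  # fixed\n")
            module = importlib.reload(module)
            after = module.add(2, 3)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("reload_demo_script", None)
    return before, after


print(reload_demo())
```

With %autoreload 2 you get this behaviour automatically before each cell is executed, so there is no need to reload by hand.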

Attach PyCharm to jupyter kernel

We identify the python process used by the notebook with

%connect_info

which returns, among other lines, one like

if you are local, you can connect with just:
jupyter <app> --existing kernel-d1b9a862-1f04-403b-82fb-5b820c0a0f89.json
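
The kernel process can also be identified by its PID: os.getpid(), run in a notebook cell, returns the PID of the kernel process, which can then be matched against the list of processes. A minimal sketch follows. The helper name kernel_process_info is hypothetical, and the assumption that the kernel's command line contains the kernel-....json file is based on how the kernel is usually launched.

```python
import os
import sys


def kernel_process_info():
    """Return the PID and command line of the current Python process."""
    return {
        "pid": os.getpid(),
        # In a notebook kernel this is expected to include the kernel-....json file.
        "argv": list(sys.argv),
    }


print(kernel_process_info())
```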

Use the above information to attach the PyCharm debugger to the Python process. Click on Run / Attach to Local Process in PyCharm's menu, and select the process identified by the kernel file:

Interactive debug

We are ready for interactive and comfortable step-by-step debugging. In PyCharm, open the file (or package) where the breakpoint is desired, and click in the left gutter to add the breakpoint:

In jupyter, execute the cell that calls the function under debug:

As you see, the execution does not return (yet). PyCharm's breakpoint pauses execution, and offers a comfortable debugger and variable window:

Going further

I find the above very helpful for debugging and understanding the stack trace at specific code locations. But I would also love to

  • catch all exceptions in PyCharm (and reproduce %pdb on)
  • set PyCharm breakpoints directly in jupyter cells.

Please let me know if you have any idea on how to achieve this!


mwouts commented Jul 3, 2020

Hello Tudor, sorry for the delay in answering your question. Can you let us know which OS, version of PyCharm and of Jupyter you are using?

I have been using this extensively on RedHat, with not so latest PyCharm and not so latest Jupyter, and on Windows, with latest PyCharm and latest Jupyter. I have not used that on Ubuntu, and I know that on Ubuntu a specific step is required.

I'd recommend that you start with attaching PyCharm to a running Python script. That's simpler, you can follow the official documentation at https://www.jetbrains.com/help/pycharm/attaching-to-local-process.html, and once you can do that, attaching the notebook kernel should be just a formality.


tlapusan commented Jul 3, 2020

Hi Mark,

thanks for your reply.
These are my versions :

  • macOS Mojave, version 10.14.6
  • pycharm 2019.3.2 (community edition)
  • jupyter core 4.6.3
  • jupyter-notebook 6.0.3

I will try your recommendations and come back with my output.
Thanks.


tlapusan commented Jul 6, 2020

Hi Mark,

I tried your suggestions and managed to attach to a normal Python program and debug it.

Here are my errors when trying to connect to a notebook (I added a print() to see which ports are already used):
[Screenshot: Screen Shot 2020-07-06 at 10 49 46 AM]

Here is the data from %connect_info:
[Screenshot: Screen Shot 2020-07-06 at 10 49 58 AM]

The strange fact is that ZMQ is trying to connect to the same ports from %connect_info. I don't understand why, but I will try to find it out.

Do you have any ideas or hints? Thanks


mwouts commented Jul 6, 2020

Hi Tudor,

Well I'd check two things:

  1. Can you connect to the existing kernel using jupyter console --existing kernel-....json in a terminal?
    I don't think that's how PyCharm connects to the process, but it's still good to check that everything is all right on the Jupyter side.
  2. Can you take a screenshot of how you attach either the Python script, or try to attach the notebook?
    Do you get something similar to my screenshot below?
    Also, are you using the same Python (for attaching) as the one from your Jupyter kernel? (Unless it needs to be the one you use to launch the server?)


tlapusan commented Jul 7, 2020

Hi Mark, I've created a video with all the steps. I think it will be more explanatory : https://youtu.be/0TRdlyC13Yw

Thanks


mwouts commented Jul 7, 2020

Thank you Tudor for the video, that is very helpful!
You're doing it exactly as I am doing on Windows or Linux.

I gave it a try myself on my old MacBook Pro (El Capitan version 10.11.6, with latest miniconda, latest PyCharm community). Just like you, when I try to attach to the Jupyter kernel, the kernel gets killed, and thus the attach fails. In the console where I started Jupyter I also have an error with ZMQ (but I'm not sure that's the cause of the failing attach).

Would you mind creating a bug report at https://youtrack.jetbrains.com/issues/PY? And share it back here? Thanks!


tlapusan commented Jul 8, 2020

Thanks Mark for your effort in helping me with this issue.
I want to look a little deeper into the ZMQ error stack trace. If I don't get good results, sure, I will create a bug report and share the results ;)


mwouts commented Jul 8, 2020

Sure! Indeed the Mac asks for networking permission when I try to attach the notebook, so maybe this has something to do with the failure.

Also, maybe we can ask @Elizaveta239, she certainly knows pydevd much better than we do, and maybe she could tell us if this notebook attach thing is known to work, or not, on Mac OS...


mwouts commented Jul 8, 2020

BTW pydevd is developed at https://github.com/fabioz/PyDev.Debugger.
No issue there mentions ZMQ... but maybe attaching a Jupyter notebook is not a frequent pattern yet 😄.

@tlapusan

Hi Mark, I spent a few more hours trying to resolve the issue, but without any success.
So, I created a ticket: https://youtrack.jetbrains.com/issue/PY-43448


wcneill commented Dec 9, 2022

Unfortunately, this is not working in PyCharm Pro 2022.3

The debugger never connects. I am left with this:
[image]

There is an issue reported on YouTrack that was cursorily misread and closed as "answered", as is typical with such things. The "answer" was: [image]
