Skip to content

Latest commit

 

History

History
909 lines (700 loc) · 29.3 KB

chapter_23_debugging_prod.asciidoc

File metadata and controls

909 lines (700 loc) · 29.3 KB

Server-Side Debugging

Warning, Chapter Update in Progress

🚧 Warning, Chapter update for 3e in progress.

This chapter has only been partially updated to Django 5, Ansible+Docker, etc.

Following along may be tricky, but I hope to have it in better shape soon!

Popping a few layers off the stack of things we’re working on: we have nice wait-for helpers; what were we using them for? Oh yes, waiting to be logged in. And why was that? Ah yes, we had just built a way of pre-authenticating a user.

The Proof Is in the Pudding: Using Staging to Catch Final Bugs

They’re all very well for running the FTs locally, but how would they work against the staging server? Let’s try to deploy our site. Along the way we’ll catch an unexpected bug (that’s what staging is for!), and then we’ll have to figure out a way of managing the database on the test server:

$ pass:quotes[ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml.yaml -vv]

PLAYBOOK: deploy-playbook.yaml.yaml *
1 plays in infra/deploy-playbook.yaml.yaml
[...]

Here’s what happens when we run the functional tests:

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests

======================================================================
ERROR: test_logged_in_users_lists_are_saved_as_my_lists (functional_tests.test_
my_lists.MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists)
 ---------------------------------------------------------------------
Traceback (most recent call last):
  File "...goat-book/functional_tests/test_my_lists.py", line 34, in
test_logged_in_users_lists_are_saved_as_my_lists
    self.wait_to_be_logged_in(email)
[...]
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate
element: Log out


======================================================================
FAIL: test_login_using_magic_link (functional_tests.test_login.LoginTest)
 ---------------------------------------------------------------------
Traceback (most recent call last):
  File "...goat-book/functional_tests/test_login.py", line 22, in
test_login_using_magic_link
    self.wait_for(lambda: self.assertIn(
[...]
AssertionError: 'Check your email' not found in 'Server Error (500)'

 ---------------------------------------------------------------------
Ran 8 tests in 68.602s

FAILED (failures=1, errors=1)

We can’t log in—​either with the real email system or with our pre-authenticated session. Looks like our nice new authentication system is crashing the server.

Let’s practice a bit of server-side debugging!

  • TODO: show we can actually repro this with Docker.

Inspecting Logs on the Server

In order to track this problem down, we need to get some logging information out of Django.

First, make sure your 'settings.py' still contains the LOGGING settings which will actually send stuff to the console:

Example 1. src/superlists/settings.py
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "root": {"handlers": ["console"], "level": "INFO"},
    },
}

Restart the Docker container again if necessary, and then either rerun the FT, or just try to log in manually. While that happens, we watch the logs on the server with docker logs -f:

elspeth@server:$ docker logs -f superlists

You should see an error like this:

Internal Server Error: /accounts/send_login_email
Traceback (most recent call last):
  File "/home/elspeth/sites/staging.ottg.co.uk/.venv/lib/python3.7/[...]
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File
"/home/elspeth/sites/staging.ottg.co.uk/accounts/views.py", line
20, in send_login_email
    [email]
[...]
    self.connection.sendmail(from_email, recipients, message.as_bytes(linesep='\r\n'))
  File "/usr/lib/python3.7/smtplib.py", line 862, in sendmail
    raise SMTPSenderRefused(code, resp, from_addr)
smtplib.SMTPSenderRefused: (530, b'5.5.1 Authentication Required. Learn more
at\n5.5.1  https://support.google.com/mail/?p=WantAuthError [...]
- gsmtp', 'noreply@superlists')

Hm, Gmail is refusing to send our emails, is it? Now why might that be? Ah yes, we haven’t told the server what our password is!

Another Environment Variable

Just as in [chapter_11_server_prep], the place we set environment variables on the server is in the superlists.env file. Let’s change it manually, on the server, for a test:

  • TODO: maybe use ansible straight away? Also, need to discuss secret storage locally.

elspeth@server:$ echo EMAIL_PASSWORD=yoursekritpasswordhere >> ~/superlists.env
elspeth@server:$ docker restart superlists

Now if we rerun our FTs, we see a change:

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests

[...]
Traceback (most recent call last):
  File "...goat-book/functional_tests/test_login.py", line 28, in
test_login_using_magic_link
    email = mail.outbox[0]
IndexError: list index out of range

[...]

selenium.common.exceptions.NoSuchElementException: Message: Unable to locate
element: Log out

The my_lists failure is still the same, but we have more information in our login test: the FT gets further, and the site now looks like it’s sending emails correctly (and the server log no longer shows any errors), but we can’t check the email in the mail.outbox…​

Adapting Our FT to Be Able to Test Real Emails via POP3

Ah. That explains it. Now that we’re running against a real server rather than the LiveServerTestCase, we can no longer inspect the local django.mail.outbox to see sent emails.

First, we’ll need to know, in our FTs, whether we’re running against the staging server or not. Let’s save the staging_server variable on self in base.py:

Example 2. src/functional_tests/base.py (ch21l009)
    def setUp(self):
        self.browser = webdriver.Firefox()
        self.test_server = os.environ.get("TEST_SERVER")
        if self.test_server:
            self.live_server_url = "http://" + self.test_server

Then we build a helper function that can retrieve a real email from a real POP3 email server, using the horrifically tortuous Python standard library POP3 client:

Example 3. src/functional_tests/test_login.py (ch21l010)
import os
import poplib
import re
import time
[...]

    def wait_for_email(self, test_email, subject):
        if not self.test_server:
            email = mail.outbox.pop()
            self.assertIn(test_email, email.to)
            self.assertEqual(email.subject, subject)
            return email.body

        email_id = None
        start = time.time()
        inbox = poplib.POP3_SSL("pop.mail.yahoo.com")
        try:
            inbox.user(test_email)
            inbox.pass_(os.environ["YAHOO_PASSWORD"])
            while time.time() - start < 60:
                # get 10 newest messages
                count, _ = inbox.stat()
                for i in reversed(range(max(1, count - 10), count + 1)):
                    print("getting msg", i)
                    _, lines, __ = inbox.retr(i)
                    lines = [l.decode("utf8") for l in lines]
                    print(lines)
                    if f"Subject: {subject}" in lines:
                        email_id = i
                        body = "\n".join(lines)
                        return body
                time.sleep(5)
        finally:
            if email_id:
                inbox.dele(email_id)
            inbox.quit()
  • TODO: consider not using POP3, maybe file backend instead. discuss tradeoff of testing config vs not needing to test that django can actually send emails

I signed up for a Yahoo account for testing, but you can use any email service you like, as long as it offers POP3 access. You will need to set the YAHOO_PASSWORD environment variable in the console that’s running the FT.

$ echo YAHOO_PASSWORD=otheremailpasswordhere >> .env
$ set -a; source .env; set +a

And then we feed through the rest of the changes to the FT that are required as a result. Firstly, populating a test_email variable, differently for local and staging tests:

Example 4. src/functional_tests/test_login.py (ch21l011-1)
@@ -9,7 +9,6 @@ from selenium.webdriver.common.keys import Keys

 from .base import FunctionalTest

-TEST_EMAIL = "[email protected]"
 SUBJECT = "Your login link for Superlists"


@@ -34,7 +33,6 @@ class LoginTest(FunctionalTest):
                     print("getting msg", i)
                     _, lines, __ = inbox.retr(i)
                     lines = [l.decode("utf8") for l in lines]
-                    print(lines)
                     if f"Subject: {subject}" in lines:
                         email_id = i
                         body = "\n".join(lines)
@@ -49,9 +47,14 @@ class LoginTest(FunctionalTest):
         # Edith goes to the awesome superlists site
         # and notices a "Log in" section in the navbar for the first time
         # It's telling her to enter her email address, so she does
+        if self.test_server:
+            test_email = "[email protected]"
+        else:
+            test_email = "[email protected]"
+
         self.browser.get(self.live_server_url)
         self.browser.find_element(By.CSS_SELECTOR, "input[name=email]").send_keys(
-            TEST_EMAIL, Keys.ENTER
+            test_email, Keys.ENTER
         )

And then modifications involving using that variable and calling our new helper function:

Example 5. src/functional_tests/test_login.py (ch21l011-2)
@@ -69,15 +69,13 @@ class LoginTest(FunctionalTest):
         )

         # She checks her email and finds a message
-        email = mail.outbox[0]
-        self.assertIn(TEST_EMAIL, email.to)
-        self.assertEqual(email.subject, SUBJECT)
+        body = self.wait_for_email(test_email, SUBJECT)

-        # It has a URL link in it
-        self.assertIn("Use this link to log in", email.body)
-        url_search = re.search(r"http://.+/.+$", email.body)
+        # It has a url link in it
+        self.assertIn("Use this link to log in", body)
+        url_search = re.search(r"http://.+/.+$", body)
         if not url_search:
-            self.fail(f"Could not find url in email body:\n{email.body}")
+            self.fail(f"Could not find url in email body:\n{body}")
         url = url_search.group(0)
         self.assertIn(self.live_server_url, url)

@@ -85,10 +83,10 @@ class LoginTest(FunctionalTest):
         self.browser.get(url)

         # she is logged in!
-        self.wait_to_be_logged_in(email=TEST_EMAIL)
+        self.wait_to_be_logged_in(email=test_email)

         # Now she logs out
         self.browser.find_element(By.LINK_TEXT, "Log out").click()

         # She is logged out
-        self.wait_to_be_logged_out(email=TEST_EMAIL)
+        self.wait_to_be_logged_out(email=test_email)

And, believe it or not, that’ll actually work, and give us an FT that can actually check for logins that work, involving real emails!

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests.test_login
[...]
OK
Note
I’ve just hacked this email-checking code together, and it’s currently pretty ugly and brittle (one common problem is picking up the wrong email from a previous test run). With some cleanup and a few more retry loops it could grow into something more reliable. Alternatively, services like mailinator.com will give you throwaway email addresses and an API to check them, for a small fee.

Managing the Test Database on Staging

Now we can rerun our full FT suite and get to the next failure: our attempt to create pre-authenticated sessions doesn’t work, so the "My Lists" test fails:

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests

ERROR: test_logged_in_users_lists_are_saved_as_my_lists
(functional_tests.test_my_lists.MyListsTest)
[...]
selenium.common.exceptions.TimeoutException: Message: Could not find element
with id id_logout. Page text was:
Superlists
Sign in
Start a new To-Do list

Ran 8 tests in 72.742s

FAILED (errors=1)

It’s because our test utility function create_pre_authenticated_session only acts on the local database. Let’s find out how our tests can manage the database on the server.

A Django Management Command to Create Sessions

To do things on the server, we’ll need to build a self-contained script that can be run from the command line on the server, most probably via Fabric.

When trying to build a standalone script that works with Django (i.e., can talk to the database and so on), there are some fiddly issues you need to get right, like setting the DJANGO_SETTINGS_MODULE environment variable, and getting sys.path correctly.

Instead of messing about with all that, Django lets you create your own "management commands" (commands you can run with python manage.py), which will do all that path mangling for you. They live in a folder called 'management/commands' inside your apps:

$ mkdir -p src/functional_tests/management/commands
$ touch src/functional_tests/management/__init__.py
$ touch src/functional_tests/management/commands/__init__.py

The boilerplate in a management command is a class that inherits from django.core.management.BaseCommand, and that defines a method called handle:

Example 6. src/functional_tests/management/commands/create_session.py (ch21l012)
from django.conf import settings
from django.contrib.auth import BACKEND_SESSION_KEY, SESSION_KEY, get_user_model
from django.contrib.sessions.backends.db import SessionStore
from django.core.management.base import BaseCommand

User = get_user_model()


class Command(BaseCommand):
    def add_arguments(self, parser):
        parser.add_argument("email")

    def handle(self, *args, **options):
        session_key = create_pre_authenticated_session(options["email"])
        self.stdout.write(session_key)


def create_pre_authenticated_session(email):
    user = User.objects.create(email=email)
    session = SessionStore()
    session[SESSION_KEY] = user.pk
    session[BACKEND_SESSION_KEY] = settings.AUTHENTICATION_BACKENDS[0]
    session.save()
    return session.session_key

We’ve taken the code for create_pre_authenticated_session from 'test_my_lists.py'. handle will pick up an email address from the parser, and then return the session key that we’ll want to add to our browser cookies, and the management command prints it out at the command line. Try it out:

$ python src/manage.py create_session [email protected]
Unknown command: 'create_session'. Did you mean clearsessions?

One more step: we need to add functional_tests to our 'settings.py' for it to recognise it as a real app that might have management commands as well as tests:

Example 7. src/superlists/settings.py (ch21l014)
+++ b/superlists/settings.py
@@ -42,6 +42,7 @@ INSTALLED_APPS = [
     "lists",
     "accounts",
+    "functional_tests",
 ]

Now it works:

$ python src/manage.py create_session [email protected]
qnslckvp2aga7tm6xuivyb0ob1akzzwl
Note
If you see an error saying the auth_user table is missing, you may need to run manage.py migrate. In case that doesn’t work, delete the db.sqlite3 file and run migrate again, to get a clean slate.

Getting the FT to Run the Management Command on the Server

Next we need to adjust test_my_lists so that it runs the local function when we’re on the local server, and make it run the management command on the staging server if we’re on that:

Example 8. src/functional_tests/test_my_lists.py (ch21l016)
from django.conf import settings

from .base import FunctionalTest
from .management.commands.create_session import create_pre_authenticated_session
from .server_tools import create_session_on_server


class MyListsTest(FunctionalTest):
    def create_pre_authenticated_session(self, email):
        if self.test_server:
            session_key = create_session_on_server(self.test_server, email)
        else:
            session_key = create_pre_authenticated_session(email)

        ## to set a cookie we need to first visit the domain.
        ## 404 pages load the quickest!
        self.browser.get(self.live_server_url + "/404_no_such_url/")
        self.browser.add_cookie(
            dict(
                name=settings.SESSION_COOKIE_NAME,
                value=session_key,
                path="/",
            )
        )

    [...]

Let’s also tweak base.py, to gather a bit more information when we populate self.test_server:

Example 9. src/functional_tests/base.py (ch21l017)
from .server_tools import reset_database  #(1)
[...]

class FunctionalTest(StaticLiveServerTestCase):
    def setUp(self):
        self.browser = webdriver.Firefox()
        self.test_server = os.environ.get("TEST_SERVER")
        if self.test_server:
            self.live_server_url = "http://" + self.test_server
            reset_database(self.test_server)  #(1)
  1. This will be our function to reset the server database in between each test. We’ll write that next, using Fabric.

Using Fabric Directly from Python

  • TODO: rewrite for ansible.

Rather than using the fab command, Fabric provides an API that lets you run Fabric server commands directly inline in your Python code. You just need to let it know the "host string" you’re connecting to:

Example 10. src/functional_tests/server_tools.py (ch18l018)
from fabric.api import run
from fabric.context_managers import settings, shell_env


def _get_manage_dot_py(host):
    return f"~/sites/{host}/.venv/bin/python ~/sites/{host}/manage.py"


def reset_database(host):
    manage_dot_py = _get_manage_dot_py(host)
    with settings(host_string=f"elspeth@{host}"):  # (1)
        run(f"{manage_dot_py} flush --noinput")  # (2)
  1. Here’s the context manager that sets the host string, in the form 'user@server-address' (I’ve hardcoded my server username, elspeth, so adjust as necessary).

  2. Then, once we’re inside the context manager, we can just call Fabric commands as if we’re in a fabfile.

For creating the session, we have a slightly more complex procedure, because we need to extract the SECRET_KEY and other env vars from the current running server, to be able to generate a session key that’s cryptographically valid for the server:

Example 11. src/functional_tests/server_tools.py (ch18l019)
def _get_server_env_vars(host):
    env_lines = run(f"cat ~/sites/{host}/.env").splitlines()  # (1)
    return dict(l.split("=") for l in env_lines if l)


def create_session_on_server(host, email):
    manage_dot_py = _get_manage_dot_py(host)
    with settings(host_string=f"elspeth@{host}"):
        env_vars = _get_server_env_vars(host)
        with shell_env(**env_vars):  # (2)
            session_key = run(f"{manage_dot_py} create_session {email}")  # (3)
            return session_key.strip()
  1. We extract and parse the server’s current environment variables from the .env file…​

  2. In order to use them in another fabric context manager, shell_env, which sets the environment for the next command…​

  3. Which is to run our create_session management command, which calls the same create_pre_authenticated_session function, but on the server.

Recap: Creating Sessions Locally Versus Staging

Does that all make sense? Perhaps a little ascii-art diagram will help:

Locally:
+-----------------------------------+       +-------------------------------------+
| MyListsTest                       |  -->  | .management.commands.create_session |
| .create_pre_authenticated_session |       |  .create_pre_authenticated_session  |
|            (locally)              |       |             (locally)               |
+-----------------------------------+       +-------------------------------------+
Against staging:
+-----------------------------------+       +-------------------------------------+
| MyListsTest                       |       | .management.commands.create_session |
| .create_pre_authenticated_session |       |  .create_pre_authenticated_session  |
|            (locally)              |       |            (on server)              |
+-----------------------------------+       +-------------------------------------+
            |                                                   ^
            v                                                   |
+----------------------------+     +--------+      +------------------------------+
| server_tools               | --> | fabric | -->  | ./manage.py create_session   |
| .create_session_on_server  |     |  "run" |      |   (on server, using .env)    |
|        (locally)           |     +--------+      +------------------------------+
+----------------------------+

In any case, let’s see if it works. First, locally, to check that we didn’t break anything:

$ python src/manage.py test functional_tests.test_my_lists
[...]
OK

Next, against the server.

$ pass:quotes[ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml.yaml -vv]

And now we run the test:

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test \
 functional_tests.test_my_lists
[...]
[[email protected]] run:
~/sites/staging.ottg.co.uk/.venv/bin/python
~/sites/staging.ottg.co.uk/manage.py flush --noinput
[...]
[[email protected]] run:
~/sites/staging.ottg.co.uk/.venv/bin/python
~/sites/staging.ottg.co.uk/manage.py create_session [email protected]
[...]
.
 ---------------------------------------------------------------------
Ran 1 test in 5.701s

OK

Looking good! We can rerun all the tests to make sure…​

$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests
[...]
[[email protected]] run:
~/sites/staging.ottg.co.uk/.venv/bin/python
[...]
Ran 8 tests in 89.494s

OK

Hooray!

Note
I’ve shown one way of managing the test database, but you could experiment with others—​for example, if you were using MySQL or Postgres, you could open up an SSH tunnel to the server, and use port forwarding to talk to the database directly. You could then amend settings.DATABASES during FTs to talk to the tunnelled port. You’d still need some way of pulling in the staging server environment variables though.
Warning: Be Careful Not to Run Test Code Against the Live Server

We’re into dangerous territory, now that we have code that can directly affect a database on the server. You want to be very, very careful that you don’t accidentally blow away your production database by running FTs against the wrong host.

You might consider putting some safeguards in place at this point. For example, you could put staging and production on different servers, and make it so they use different keypairs for authentication, with different passphrases.

This is similarly dangerous territory to running tests against clones of production data. I have a little story about accidentally sending thousands of duplicate invoices to clients in [data-migrations-appendix]. LFMF.

Updating our Deploy Script

  • TODO: ansible.

Before we finish, let’s update our deployment fabfile so that it can automatically add the EMAIL_PASSWORD to the .env file on the server:

Example 12. src/deploy_tools/fabfile.py (ch18l021)
import os
[...]


def _create_or_update_dotenv():
    append(".env", "DJANGO_DEBUG_FALSE=y")
    append(".env", f"SITENAME={env.host}")
    current_contents = run("cat .env")
    if "DJANGO_SECRET_KEY" not in current_contents:
        new_secret = "".join(
            random.SystemRandom().choices("abcdefghijklmnopqrstuvwxyz0123456789", k=50)
        )
        append(".env", f"DJANGO_SECRET_KEY={new_secret}")
    email_password = os.environ["EMAIL_PASSWORD"]  # (1)
    append(".env", f"EMAIL_PASSWORD={email_password}")  # (1)
  1. We just add two lines at the end of the script which will essentially copy the local EMAIL_PASSWORD environment variable up to the server’s .env file.

Wrap-Up

Actually getting your new code up and running on a server always tends to flush out some last-minute bugs and unexpected issues. We had to do a bit of work to get through them, but we’ve ended up with several useful things as a result.

We now have a lovely generic wait decorator which will be a nice Pythonic helper for our FTs from now on. We have test fixtures that work both locally and on the server, including the ability to test "real" email integration. And we’ve got some more robust logging configuration.

But before we can deploy our actual live site, we’d better actually give the users what they wanted—​the next chapter describes how to give them the ability to save their lists on a "My Lists" page.

Lessons Learned Catching Bugs in Staging
Fixtures also have to work remotely

LiveServerTestCase makes it easy to interact with the test database using the Django ORM for tests running locally. Interacting with the database on the staging server is not so straightforward. One solution is Fabric and Django management commands, as I’ve shown, but you should explore what works for you—​SSH tunnels, for example.

Be very careful when resetting data on your servers

A command that can remotely wipe the entire database on one of your servers is a dangerous weapon, and you want to be really, really sure it’s never accidentally going to hit your production data.

Logging is critical to debugging issues on the server

At the very least, you’ll want to be able to see any error messages that are being generated by the server. For thornier bugs, you’ll also want to be able to do the occasional "debug print", and see it end up in a file somewhere.