Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Next.js server instrumentation #2959

Merged
merged 70 commits into from
Oct 28, 2022
Merged

feat: Next.js server instrumentation #2959

merged 70 commits into from
Oct 28, 2022

Conversation

trentm
Copy link
Member

@trentm trentm commented Sep 29, 2022

Closes: #1611

Planned commit message:

feat: Next.js server instrumentation

This adds instrumentation of the Next.js dev server (`next dev`) and prod
server (`next start`). The APM transactions for incoming HTTP requests to the
server will be named appropriately based on Next.js's routing -- both for
user page routes (e.g. `GET /a-dynamic-page/[id]`) and for internal Next.js
routes (e.g. `Next.js _next/data route my-page`,
`Next.js Rewrite route /foo -> /bar`). As well, exceptions in server-side code
(e.g. `getServerSideProps`, server-side run page handlers, API handlers) will
be reported.

This is a "technical preview", meaning we might change how this works in
future versions -- including dropping the separate "start-next.js" start
module.

This also makes a change to stacktrace reporting. If a stack frame refers to
a "*.min.js" file path and does not have source-map information, it is assumed
to be a minimized file and source file context is *not* reported. This is to
avoid excessively large stacktrace data. Reporting source file lines for
name-minimized and 500+ character lines is not useful.

Checklist

  • Implement code
  • Add tests
  • Update TypeScript typings
  • Update documentation
  • Add CHANGELOG.asciidoc entry
  • Commit message follows commit guidelines
  • Remove the following "XXX" notes before merging. They are about pointing docs/examples to the feature branch until merged to main:
examples/nextjs/package.json
11:  "XXX": "change to 'main' branch for merge, and issue to update this on release",

docs/nextjs.asciidoc
50:NOTE: XXX Until this is merged, the example is at: <https://github.com/elastic/apm-agent-nodejs/tree/trentm/feat-nextjs/examples/nextjs/>
63:NOTE: XXX Until this is in a release use `npm install "elastic-apm-node@elastic/apm-agent-nodejs#trentm/feat-nextjs"`.

@trentm trentm self-assigned this Sep 29, 2022
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Sep 29, 2022
@apmmachine
Copy link
Contributor

apmmachine commented Sep 29, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-10-27T18:47:41.575+0000

  • Duration: 23 min 11 sec

Test stats 🧪

Test Results
Failed 0
Passed 322664
Skipped 0
Total 322664

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run module tests for <modules> : Run TAV tests for one or more modules, where <modules> can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules

  • run benchmark tests : Run the benchmark test only.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

…andling

- Set trans.name before a possible captureError handling a route. This
  is because `captureError(err)` will set `error.transaction.*` fields.
  We want the `trans.name` to be set by then.
- Ensure that when an *error* page component is being loaded, it isn't
  used for the transaction name.
Do a first pass at eslinting test/nextjs/a-nextjs-app, but then get
eslint to skip that dir because it can't handle
    'Link' is defined but never used
with JSX templates and I can't be bother to solve configuring eslint for
this subdir
…the recursive link for the 'elastic-apm-node' dep
… class load because after ctor is too late to catch generateRoutes for the prod server
Test hang fix: spawn `node .../.bin/next ...` directly rather than via
npm, else the proc.kill() sends a signal to a defunct process and does
nothing).
- skip Next.js tests with node v14's that are <v14.5.0 because Next doesn't support it
- Attempt to fix spawn on Windows ... because in the Windows ci the 'node_modules/.bin/next' is a *shell* script!?"
  In the Jenkins CI test step for Windows it was failing with:

    [2022-10-13T15:29:36.112Z] # [Next.js server stderr] C:\Users\jenkins\workspace\PR-2959-14-a34a304d-a2cd-4bb2-88e1-2b7d2fddc0dd\src\github.com\elastic\apm-agent-nodejs\test\nextjs\a-nextjs-app\node_modules\.bin\next:2
    [2022-10-13T15:29:36.112Z] # basedir=$(dirname "$(echo "$0" | sed -e 's,\\,/,g')")
    [2022-10-13T15:29:36.112Z] # ^^^^^^^
    [2022-10-13T15:29:36.112Z] #
    [2022-10-13T15:29:36.112Z] # SyntaxError: missing ) after argument list
…lower CI

There is something I don't grok on a Windows install of next.
The "node_modules/.bin/next" is a *shell script*!? The current spawn
test code results in this on Windows:

[2022-10-13T18:46:50.744Z]       Error: spawn ./node_modules/.bin/next ENOENT
[2022-10-13T18:46:50.744Z]           at Process.ChildProcess._handle.onexit (internal/child_process.js:277:19)
[2022-10-13T18:46:50.744Z]           at onErrorNT (internal/child_process.js:472:16)
[2022-10-13T18:46:50.744Z]           at processTicksAndRejections (internal/process/task_queues.js:82:21)

Perhaps adding shell would help. Really should grok why
shell script vs a node script on macos and Linux.
…reement on default from server and client (as discussed earlier here #2380 (comment))
This time was getting:

    [2022-10-13T20:38:08.562Z] # [Next.js server stderr] '.' is not recognized as an internal or external command,
    [2022-10-13T20:38:08.562Z] # operable program or batch file.
…JUnit XML files including 'next dev' ANSI escapes

 Failed to read test report file /var/lib/jenkins/workspace/PR-2959-19-d0425736-922e-4205-8d3a-b205f2e2877e/src/github.com/elastic/apm-agent-nodejs/test_output/test-nextjs-a-nextjs-app-next.test.js.tap.junit.xml
org.dom4j.DocumentException: Error on line 132 of document  : An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
	at org.dom4j.io.SAXReader.read(SAXReader.java:511)
	at org.dom4j.io.SAXReader.read(SAXReader.java:392)
	at hudson.tasks.junit.SuiteResult.parse(SuiteResult.java:177)
	at hudson.tasks.junit.TestResult.parse(TestResult.java:384)
	at hudson.tasks.junit.TestResult.parsePossiblyEmpty(TestResult.java:314)
	at hudson.tasks.junit.TestResult.parse(TestResult.java:256)
	at hudson.tasks.junit.TestResult.parse(TestResult.java:242)
	at hudson.tasks.junit.TestResult.parse(TestResult.java:220)
	at hudson.tasks.junit.TestResult.<init>(TestResult.java:174)
	at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:176)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3502)
	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:376)
	at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:122)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.xml.sax.SAXParseException; lineNumber: 132; columnNumber: 88; An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:178)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:399)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:326)
	at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1466)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2922)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:601)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:504)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:642)
	at org.dom4j.io.SAXReader.read(SAXReader.java:494)
	... 19 more
@trentm trentm marked this pull request as ready for review October 20, 2022 17:02
@trentm trentm requested review from astorm and bmorelli25 October 20, 2022 17:02
@trentm
Copy link
Member Author

trentm commented Oct 20, 2022

@bmorelli25 I'd love your feedback, if you have a chance, on the added "Get started with Next.js" doc. A current build of that is here: https://apm-agent-nodejs_2959.docs-preview.app.elstc.co/guide/en/apm/agent/nodejs/master/nextjs.html

@astorm Here are some possibly helpful review notes. Please ask if you have Qs.

  • To play around with the instrumentation there is the simple "examples/nextjs/" app. However, a better play app is the one used for testing:
cd test/instrumentation/modules/next/a-nextjs-app
npm ci

# start a mockapmserver

npm run dev   # to run the dev server

npm run build && npm start  # to run the prod server

# Open http://localhost:3000/ in your browser and click around, or try some of these:
curl -i localhost:3000/a-page/
curl -i localhost:3000/a-page-redirect

curl -i localhost:3000/rewrite-to-a-page
curl -i localhost:3000/rewrite-to-a-dynamic-page
curl -i localhost:3000/rewrite-to-a-public-file
curl -i localhost:3000/rewrite-to-a-404
curl -i localhost:3000/rewrite-external/foo

curl -i localhost:3000/
curl -i localhost:3000/a-page
curl -i localhost:3000/a-dynamic-page/42
curl -i localhost:3000/an-ssr-page
curl -i localhost:3000/an-ssr-error-page

curl -i localhost:3000/api/an-api-endpoint
curl -i localhost:3000/api/a-dynamic-api-endpoint/123
curl -i localhost:3000/api/an-api-endpoint-that-throws
  • "lib/instrumentation/modules/next/README.md" attempts to give an overview of how the instrumentation works. Reading "lib/instrumentation/modules/next/dist/server/next-server.js" is the main meat.
  • For testing, "test/instrumentation/modules/next/next.test.js" is the main thing to read.

Copy link
Member

@bmorelli25 bmorelli25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like how these docs turned out. I only have a few suggestions.

docs/nextjs.asciidoc Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
docs/nextjs.asciidoc Outdated Show resolved Hide resolved
examples/nextjs/README.md Outdated Show resolved Hide resolved
examples/nextjs/README.md Outdated Show resolved Hide resolved
CHANGELOG.asciidoc Outdated Show resolved Hide resolved
There had been a issue in the TAV tests where the `npm ci` run as
part of a test run would *override* the `npm install --no-save ...`
changes that TAV was making. This resulted in older versions never
actually being tested. Fixing that uncovered some issues/limitations
in the tests and instrumentation for older versions back to the
min 11.1.0 supported.

- Improve de-dupe of captureError for earlier next versions that
  can call `renderErrorToResponse` twice.
- Handle a change in `findPageComponents` signature in v12.2.6-canary.10
- Change how the test flushes and stops the test Next.js server to
  not rely on graceful shutdown... because NEXT_MANUAL_SIG_HANDLE
  support was only added in v12.1.7-canary.7.
- Add instrumentation of the top-level "next/dist/server/next.js" to
  setFramework early enough to catch the first intake request. I didn't
  dig in all the way, but in [email protected] the lazy load of
  "next-server.js" seemed to have changed such that it came after
  cloud metadata fetching uncorked the first intake request.
- Adjust the test a-nextjs-app as required to work with Next.js 13
  per some of the points in https://beta.nextjs.org/docs/upgrade-guide
  (no `<a>` child of `<Link>`, no `next/future/image`).
- Skip a couple of test cases on some older 12.0.x and 11.1.x versions
  of next because it isn't worth it right now.
Copy link
Contributor

@astorm astorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems ok/good? I had a look over everything and left some questions but nothing seems out of line for a tech preview. I also ran through the basic docs (yay docs) and was able to get an example app up, running, instrumented with the agent and it produced reasonable transaction names.

I didn't go too deep into the specifics of which Next.js modules/methods we were instrumenting, so if there's something specific in the instrumented packages you think needs more scrutiny let me know, otherwise approved.

lib/instrumentation/index.js Show resolved Hide resolved
test/config.test.js Show resolved Hide resolved
@@ -0,0 +1,32 @@
# Test Next.js versions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no objections, (and maybe it gets clearer if I get deeper into the review), but why this new .tav.yml file instead of using the main one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have an entirely solid answer. Mainly because it allows for clean separation of the package.json for this Next.js app. If this tav config was in the top-level .tav.yml file, then the dependencies:

  "dependencies": {
    "next": "^13.0.0",
    "react": "^18.2.0",
    "react-dom": "^18.2.0"
  }

would have to be in the top-level "devDependencies". In general that muddies the waters for understanding just this Next.js test case. Currently as it is you can work just in this test dir:

cd test/instrumentation/modules/next/a-nextjs-app
npm install
node ../next.test.js
npx tav

While doing (the long) dev on this PR that was helpful. And especially while learning Next.js it was helpful to be clear about exactly what deps were involved for the test app... and not the ~80 devDeps that we currently have at the top-level.

In this case there aren't any module conflicts but, for example, there are with the graphql-related modules (hence the npm uninstall ... commands in .tav.yml. I'm not proposing doing it soon, but I think it would be interesting to eventually consider a layout where there are multiple separate .tav.yml files -- perhaps one per instrumentation module.


You can also take a look at and use this https://github.com/elastic/apm-agent-nodejs/tree/main/examples/nextjs/[Next.js + Elastic APM example app].

NOTE: XXX Until this is merged, the example is at: <https://github.com/elastic/apm-agent-nodejs/tree/trentm/feat-nextjs/examples/nextjs/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have a checklist bullet for this in the description above.

npm install elastic-apm-node --save # or 'yarn add elastic-apm-node'
----

NOTE: XXX Until this is in a release use `npm install "elastic-apm-node@elastic/apm-agent-nodejs#trentm/feat-nextjs"`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have a checklist bullet for this in the description above.

"start": "NODE_OPTIONS=--require=elastic-apm-node/start-next.js next start",
"lint": "next lint"
},
"XXX": "change to 'main' branch for merge, and issue to update this on release",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have a checklist bullet for this in the description above.

@trentm trentm merged commit 32c364f into main Oct 28, 2022
@trentm trentm deleted the trentm/feat-nextjs branch October 28, 2022 20:12
@apmmachine
Copy link
Contributor

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2022-10-28T20:05:26.118+0000

  • Duration: 24 min 40 sec

Test stats 🧪

Test Results
Failed 0
Passed 311021
Skipped 0
Total 311021

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run module tests for <modules> : Run TAV tests for one or more modules, where <modules> can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules

  • run benchmark tests : Run the benchmark test only.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

fpm-peter pushed a commit to fpm-git/apm-agent-nodejs that referenced this pull request Aug 20, 2024
This adds instrumentation of the Next.js dev server (`next dev`) and prod
server (`next start`). The APM transactions for incoming HTTP requests to the
server will be named appropriately based on Next.js's routing -- both for
user page routes (e.g. `GET /a-dynamic-page/[id]`) and for internal Next.js
routes (e.g. `Next.js _next/data route my-page`,
`Next.js Rewrite route /foo -> /bar`). As well, exceptions in server-side code
(e.g. `getServerSideProps`, server-side run page handlers, API handlers) will
be reported.

This is a "technical preview", meaning we might change how this works in
future versions -- including dropping the separate "start-next.js" start
module.

This also makes a change to stacktrace reporting. If a stack frame refers to
a "*.min.js" file path and does not have source-map information, it is assumed
to be a minimized file and source file context is *not* reported. This is to
avoid excessively large stacktrace data. Reporting source file lines for
name-minimized and 500+ character lines is not useful.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-nodejs Make available for APM Agents project planning.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for nextjs
4 participants