Gaps in Theia's instrumentation #30

Open
MatthewKhouzam opened this issue Mar 19, 2024 · 5 comments

Comments

@MatthewKhouzam
Contributor

When Theia's tracing is loaded, it shows performance improvements, but also large areas of "unknowns".

[Screenshot from 2024-03-19 14-51-53: the trace loaded in Trace Compass]

It would be good to add instrumentation for the areas with no info. This will show where the big wins can be had in terms of performance optimization.

@JonasHelming
Contributor

@planger

@planger
Contributor

planger commented Mar 26, 2024

@MatthewKhouzam Thank you for raising this issue! We'll set aside some time in the upcoming weeks to investigate this instrumentation gap.

@tortmayr
Contributor

@MatthewKhouzam I have started looking into this. Unfortunately, I'm having issues with the get_trace.py script. The recent addition of Electron-based tests caused some structural changes, and the script no longer works for me.

Running it as is results in:

python3 get_trace.py
Cloning into 'theia-e2e-test-suite'...
remote: Enumerating objects: 24090, done.
remote: Counting objects: 100% (2052/2052), done.
remote: Compressing objects: 100% (763/763), done.
remote: Total 24090 (delta 1619), reused 1716 (delta 1288), pack-reused 22038
Receiving objects: 100% (24090/24090), 16.57 MiB | 8.85 MiB/s, done.
Resolving deltas: 100% (19462/19462), done.
Branch 'gh-pages' set up to track remote branch 'gh-pages' from 'origin'.
Switched to a new branch 'gh-pages'
 33%|█████████████████████████████████████████████████████████████▋                                                                                                                           | 1/3 [00:00<00:00, 4583.94it/s]
Traceback (most recent call last):
  File "/home/tobias/Git/OpenSource/theia/theia-e2e-test-suite/scripts/convert-to-trace-event/get_trace.py", line 68, in <module>
    with open(input_trace, 'r', encoding='utf8') as content:
IsADirectoryError: [Errno 21] Is a directory: 'performance/chromium'

When changing FOLDER_PATH to the new value (performance/chromium or performance/electron), I get the following error:

 python3 get_trace.py
Cloning into 'theia-e2e-test-suite'...
remote: Enumerating objects: 24090, done.
remote: Counting objects: 100% (2052/2052), done.
remote: Compressing objects: 100% (763/763), done.
remote: Total 24090 (delta 1619), reused 1716 (delta 1288), pack-reused 22038
Receiving objects: 100% (24090/24090), 16.57 MiB | 2.03 MiB/s, done.
Resolving deltas: 100% (19462/19462), done.
Branch 'gh-pages' set up to track remote branch 'gh-pages' from 'origin'.
Switched to a new branch 'gh-pages'
  0%|                                                                                                                                                                                                | 0/1821 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/tobias/Git/OpenSource/theia/theia-e2e-test-suite/scripts/convert-to-trace-event/get_trace.py", line 108, in <module>
    with open(output_trace, "w", encoding='utf8') as trace_file:
FileNotFoundError: [Errno 2] No such file or directory: '../traces/2024-1-13T5-9-9_6.json'

Any ideas on how to fix that?
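
One possible cause, assuming nothing else in the script changed: the relative ../traces output folder may simply not exist when the script tries to write the converted trace. A minimal, hypothetical workaround (the path is copied from the traceback above; the write is only a placeholder for what the real script does):

import os

# Hypothetical guess: create the output directory before writing the trace.
# output_trace below is the path from the traceback; in get_trace.py it is
# computed from the downloaded metrics files.
output_trace = '../traces/2024-1-13T5-9-9_6.json'
os.makedirs(os.path.dirname(output_trace), exist_ok=True)
with open(output_trace, 'w', encoding='utf8') as trace_file:
    trace_file.write('[]')  # placeholder; the real script writes the converted trace events here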

@tortmayr
Contributor

tortmayr commented May 2, 2024

After investigating this further, it turns out that the problem is not really related to an actual gap in our instrumentation (although there are certainly some parts that could benefit from more detailed measurements).

The main problem is that the trace conversion treats the time information provided by the measurements as absolute values. However, they are in fact relative measurements (relative to the startup point of the corresponding process).

For instance, consider the following example:

  1. Start browser example backend (yarn start browser:debug)
  2. Wait for 20 seconds
  3. Open the frontend (http://localhost:3000)

If we now have a look at the measurements in the metrics report, e.g.:

theia_measurements{id="backend", name="DefaultMessagingService.initialize", startTime="290.0829919576645", owner="backend"} 15.945707023143768
...
theia_measurements{id="frontend-1", name="TabBarDecoratorService.initialize", startTime="1149.5", owner="frontend"} 1.399999976158142

Currently, the trace events for this example are placed at 290.08 ms and 1149.5 ms. However, the measurement for the frontend is relative to its own startup, so to place it correctly we would have to take the actual starting point of the frontend into account and use it as an offset. Assuming the 20-second offset from the example above, the correct placement for the frontend event (TabBarDecoratorService.initialize) would be 21149.5 ms.
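
As a minimal sketch of that correction (the offset table and helper below are purely illustrative and not part of get_trace.py):

# Illustrative only: place relative measurements on a common timeline by
# adding the start offset of the owning process (backend start = 0 here).
PROCESS_START_OFFSET_MS = {
    "backend": 0.0,       # backend startup is the reference point
    "frontend": 20000.0,  # assumed: frontend started 20 s later, as in the example
}

def place_on_timeline(owner, relative_start_ms):
    return PROCESS_START_OFFSET_MS[owner] + relative_start_ms

print(place_on_timeline("backend", 290.0829919576645))  # ~290.08, unchanged
print(place_on_timeline("frontend", 1149.5))            # 21149.5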

To sum up: backend traces are placed correctly, but frontend traces need to be adapted to incorporate the actual frontend startup time.

We somehow need to provide this information in the metrics report so that it can be used as an anchor point. For instance, it might be possible to hook into the GET request for the frontend URL in the backend and log a measurement for the frontend startup before the request is resolved.

Alternatively, we could also look into the current Stopwatch API. It might be a good idea to include startTimes based on the current system time instead of relying solely on relative measurements.
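
A rough sketch of what the conversion side could do with such an anchor point, assuming the backend logged a hypothetical measurement (here called "frontendStart") when the frontend URL is requested; the metric lines and names below are made up for illustration:

import re

# Made-up metrics excerpt: a hypothetical "frontendStart" anchor logged by the
# backend, plus the frontend measurement from the example above.
METRICS = '''theia_measurements{id="backend", name="frontendStart", startTime="20000.0", owner="backend"} 0.1
theia_measurements{id="frontend-1", name="TabBarDecoratorService.initialize", startTime="1149.5", owner="frontend"} 1.399999976158142'''

PATTERN = re.compile(r'name="([^"]+)".*startTime="([^"]+)".*owner="([^"]+)"')

frontend_offset = 0.0
events = []
for line in METRICS.splitlines():
    match = PATTERN.search(line)
    if not match:
        continue
    name, start, owner = match.group(1), float(match.group(2)), match.group(3)
    if name == "frontendStart":
        frontend_offset = start  # frontend startup, relative to backend start
    else:
        events.append((name, start, owner))

# Shift frontend events so every event shares the backend's timeline.
for name, start, owner in events:
    placed = start + frontend_offset if owner == "frontend" else start
    print(name, placed)  # TabBarDecoratorService.initialize 21149.5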

@MatthewKhouzam
Contributor Author

Hey, I'm really sorry, this flew under the radar. This is very good information. I can easily update the script. I will test over the weekend! Cheers!
