Replies: 20 comments
-
Hey @tido64, that sounds like a great idea! I'll be working more closely on the topic of benchmarking & tooling from April, but until then I think that starting this conversation is 🔥 I think we should try to come up with at least a baseline of what is "acceptable" (e.g. 200 ms for a full round trip from click to visual change), maybe on a range of different devices (e.g. Android+old, Android+new, iOS+old, iOS+new). The one benchmark I can see missing here is an "app bundling" benchmark, which could surface suboptimal native-side settings of the mobile app project (or improvements to them); at least, this was an area where I managed to find a few easy wins the last time I did some performance improvements for a production app I was working on -> https://gist.github.com/kelset/2eb61161a68d1ab35337d1eaa9a05e78
-
Totally forgot to mention, but @Kudo did a lot of work in the benchmarking space -> https://github.com/Kudo/react-native-js-benchmark
-
@kelset That's great to hear. Do you have some sort of plan for what you want to do? Regarding the baseline, I like the idea of having goals. I was thinking more of having a diff against `master`.

@Kudo How much work do you think it will take to bring this to iOS as well?
-
I'm interested in this problem space and can help out 👍
-
So the plan is still a bit hand-wavy, but I'll be working 6 weeks full time on an open source book about react-native performance (both how to measure it, finding the baselines, bottlenecks, etc., and how to act on those findings). So for your proposal, I think I'll be able to help you out with the baseline & benchmarking sides (at least). But I'll only start in April AFAIK.

Just so I understand better: by "diff" do you mean a report of the delta between commits? Like, "this commit has made the TTI 30 ms slower"?

By app bundling I mean the process that both iOS and Android need to perform to generate the .ipa/.apk. I think that by keeping an eye on this too we could make sure that whenever native-side configurations are touched, they don't lead to less optimal build processes (e.g. say a commit changes the compiler optimization level; by checking the build time we could detect whether it's actually improving the overall time or making it slower). Another side effect is that it would help detect easy wins in the native-side project configuration: say we never realised that we could use Android's APK split feature (can't recall its precise name at the moment, the one that only bundles for the specific architecture; I may be misremembering completely). By having a baseline (e.g. a fresh new Android app takes 2 minutes to generate the APK) we could detect that our config, because it was first created, say, 3 years ago, is 10x slower because we never updated that particular setting to take advantage of newer native-side tooling. A further side effect is that the app would then be built more efficiently for the device it runs on, leading to better "in-app" perf.
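For reference, the feature hinted at above is Android's ABI splits (newer tooling offers the same benefit via Android App Bundles). A rough `android/app/build.gradle` fragment; the exact settings depend on your Android Gradle plugin version:

```groovy
android {
    splits {
        abi {
            enable true            // build one APK per CPU architecture
            reset()
            include "armeabi-v7a", "arm64-v8a", "x86", "x86_64"
            universalApk false     // skip the fat APK that bundles every ABI
        }
    }
}
```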
-
Correct. The baseline should be the branch you're merging against, most commonly `master`.

I'm not sure if this makes sense. If we misconfigured React Native to be built with
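A minimal sketch of the kind of baseline diff described here, assuming benchmark results are simple name-to-milliseconds mappings (the benchmark names and the 5% threshold are made up for illustration):

```python
def diff_report(baseline, current, threshold_pct=5.0):
    """Compare benchmark results (name -> milliseconds) against a baseline.

    Returns a list of (name, baseline_ms, current_ms, delta_pct, flag) rows,
    where flag is 'regression', 'improvement', or 'ok'.
    """
    rows = []
    for name, base_ms in baseline.items():
        cur_ms = current.get(name)
        if cur_ms is None:
            continue  # benchmark removed or renamed; skip rather than guess
        delta_pct = (cur_ms - base_ms) / base_ms * 100.0
        if delta_pct >= threshold_pct:
            flag = "regression"
        elif delta_pct <= -threshold_pct:
            flag = "improvement"
        else:
            flag = "ok"
        rows.append((name, base_ms, cur_ms, delta_pct, flag))
    return rows

# Example: "this commit has made the TTI 30 ms slower"
rows = diff_report({"TTI": 300.0}, {"TTI": 330.0})  # flags TTI as a regression (+10%)
```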
-
I see, makes sense! OK then, let's leave app bundling out of this... there's plenty already anyway ;)
-
I've added a table of tasks and estimates. Let me know if I missed anything. Also feel free to assign yourself to any of them 😉
-
Thanks for adding the table! The measurement we get asked for a lot internally is what the "working set" of React Native is. All the people who ask for it are ex-Microsoft :-). We aren't fully sure how we're supposed to measure it, and there's no one better than you to help us get that set up! :-) I'd like to make sure that's part of this, as it will help us get long-awaited answers for the VPs asking for it.
-
BTW, we have a couple of synthetic performance tests in the new Fabric infra, and we (I) plan to add a couple more covering the most performance-critical aspects (hot paths) of the UI part of the framework (prop parsing, diffing, layout, node cloning, and so on). We use Google Benchmark for that. The challenging (unexplored) part is running them continuously on CI, and running them on ARM CPUs to get actual results for the architectures we care about (x86 is usually fast enough).
-
@shergin That's awesome! How much work would it be to get them pushed to the open source repo? I think it's a good starting point regardless. We can figure out how to get non-x86 results after we have them running continuously. I've added a task for it.

@TheSavior Updated the table. I think we have a team internally that has solved this. I can check with them.
-
react-native-js-benchmark is currently focused on JS engine comparison, but I think it could be changed to support CI regression monitoring. For iOS support, the most difficult part is running on real devices, including the developer certificates and code signing.
-
From the MS side, we're going to look into funding CI using AppCenter, which has all the real devices we would need for iOS and Android.
-
Please let me check whether my understanding is correct. I did similar work that used the RN CLI to generate a clean project and then patched it to test my V8 integration.
-
@Kudo, I think it makes sense to just report the numbers in the first iteration. We can add diffing in the next iteration. @sweggersen had some ideas about how to do this.
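Reporting just the raw numbers in a first iteration could look like this rough sketch (the artifact names and values are made up):

```python
def format_report(measurements):
    """Render benchmark numbers (name -> value-with-unit string) as a
    Markdown table suitable for posting as a PR comment."""
    lines = ["| Benchmark | Value |", "| --- | --- |"]
    for name, value in measurements.items():
        lines.append(f"| {name} | {value} |")
    return "\n".join(lines)

comment = format_report({"RNTester.apk size": "24.3 MB", "TTI": "312 ms"})
```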
-
Got it, I'll start sketching out some work accordingly.
-
This is a really good idea! Looks like the nightly builds are already taken, but if you (@sweggersen) need any help from the CI/releases perspective, please let me know!

My main question is how we're planning on broadcasting regressions that happen on master. PRs are one thing, but I think the additional benefit is being able to track and double-check the diffs that are synced from Facebook every once in a while. I'm not sure there's a mechanism for that in place yet, but being able to notify early about CI failures (e.g. "this commit has broken CI on master" or "this commit has caused certain regressions") could prevent situations where we're stuck with a red CI for a long time and breakages just keep adding up.
-
What's the next step for the bot that is posting about the app sizes? I'm excited about the future plans, but as-is it's a bit... loud. Is it possible to get the bot to delete all of its previous comments in a PR when it posts? Check out this PR for a particular example where, since there are lots of changes, most of the comments are from the bot: facebook/react-native#27908. If it's going to be a while before the bot can start commenting with deltas, which are a bit more actionable, does it make sense to temporarily disable it? I'm not sure the absolute numbers being shared on PRs are that helpful currently.
-
@TheSavior: We've just merged some improvements to the bot. It should now update its posts instead. Let me know if it's still too loud, and we'll figure something out.
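One common way to make a bot update its post in place is to tag the comment with a hidden HTML marker and upsert against it. A sketch of just the selection logic, with the actual GitHub API calls left as comments (the marker string is an assumption, not what the real bot uses):

```python
MARKER = "<!-- size-bot -->"  # hypothetical hidden marker identifying the bot's comment

def upsert_action(existing_comments, new_body):
    """Decide whether to create a new comment or update the bot's old one.

    existing_comments: list of {"id": int, "body": str} dicts, as returned by
    GET /repos/{owner}/{repo}/issues/{number}/comments.
    Returns ("update", comment_id, body) or ("create", None, body).
    """
    body = f"{MARKER}\n{new_body}"
    for c in existing_comments:
        if MARKER in c["body"]:
            # PATCH /repos/{owner}/{repo}/issues/comments/{id} with this body
            return ("update", c["id"], body)
    # POST /repos/{owner}/{repo}/issues/{number}/comments with this body
    return ("create", None, body)
```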
-
I was busy for the past few weeks, sorry for not meeting my original plan.

Thanks to @grabbou for the awesome plugin support in the CLI.
-
Introduction

We are currently blind to any regressions that don't directly translate to pixels on the screen. Performance regressions, in particular, aren't always straightforward to measure, as different platforms and devices may yield wildly different results depending on how you do the measurements.

The Core of It

A suite of benchmarks and tools should be run on all PRs, and on every commit to `master`, to catch regressions (or improvements) compared to a baseline.

What we should measure:

… a `react-native init` app or RNTester.

How do we report this?

On every merge (to `master`) there should be a benchmark run to keep track of performance over time. The report should be publicly available, similar to the Firefox Performance Dashboard.

Suggested Timeline

Discussion points

cc @sweggersen, @danrydholm
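The PR-plus-master trigger described above could be wired up roughly as follows. This is only an illustration using GitHub Actions syntax; the `benchmark` scripts are hypothetical, and the actual setup might equally live in CircleCI or another CI system:

```yaml
name: benchmarks
on:
  pull_request:
  push:
    branches: [master]
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: yarn install --frozen-lockfile
      - run: yarn benchmark --output results.json   # hypothetical benchmark runner
      - run: yarn benchmark:compare results.json    # hypothetical diff vs. baseline
```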