Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #12771: Progress value is not increased #6993

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

ludviggunne
Copy link
Contributor

No description provided.

@ludviggunne ludviggunne changed the title fix #12771 fix #12771: Progress value is not increased Nov 11, 2024
lib/importproject.cpp Outdated Show resolved Hide resolved
@firewave
Copy link
Collaborator

I have similar code which deals with the language. I guess the basic logic can be re-used/combined from the changes.

@firewave
Copy link
Collaborator

This also needs tests for the cases the progress was not correctly reported.

@ludviggunne
Copy link
Contributor Author

I have similar code which deals with the language. I guess the basic logic can be re-used/combined from the changes.

Hm, I'm not sure what you mean, do you have a PR for it?

@firewave
Copy link
Collaborator

Hm, I'm not sure what you mean, do you have a PR for it?

Not yet - #6978 is in preparation of the change I referred to. But I am not too sure about it - the context-less usages (i.e. no file, no language) which will only lead to issue with specific usage are giving me headaches. Maybe I use your use case to proceed with this.

cli/processexecutor.cpp Outdated Show resolved Hide resolved
test/cli/proj2_test.py Outdated Show resolved Hide resolved
@@ -106,7 +108,7 @@ class ThreadData
if (mItNextFileSettings != mFileSettings.end()) {
file = nullptr;
fs = &(*mItNextFileSettings);
fileSize = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we always fetch the filesize we probably should initialize/set it to -1 everywhere and valid that later on. Can be done at a later date though.

lib/importproject.cpp Outdated Show resolved Hide resolved
@firewave
Copy link
Collaborator

firewave commented Nov 20, 2024

FYI when we need to test something related to inputs there is three different cases to consider:

  • files specified via CLI
  • files specified via a .cppcheck project (basically emulates the CLI options but sometimes has its own logic)
  • any other project (for ease that would be a compilation database)

@ludviggunne
Copy link
Contributor Author

FYI when we need to test something related to inputs there is three different cases to consider:

Okay, thanks! I'll be focusing on other tickets for a while so I'll pick this up at a later time.

@firewave
Copy link
Collaborator

I'll be focusing on other tickets for a while so I'll pick this up at a later time.

Don't end up like me... 🙄

@firewave
Copy link
Collaborator

I would prefer to have the code which gets the filesize in path.cpp.

@ludviggunne
Copy link
Contributor Author

ludviggunne commented Dec 22, 2024

I noticed the percentage reported in SingleExecutor for FileSettings is based on the number of files instead of the file sizes, is this intentional?

reportStatus(c, mFileSettings.size(), c, mFileSettings.size());

@firewave
Copy link
Collaborator

I noticed the percentage reported in SingleExecutor for FileSettings is based on the number of files instead of the file sizes, is this intentional?

No idea - I would need to look at the history.

But I wonder why we calculate it based on on the file size anyways. That makes no sense because we are using the size of the actual input file. A file with just a few lines itself might end up with much more code which is being analyzed. So that would only make sense if we used the size of preprocessed code.

I think we should just consider the amount of files.

@ludviggunne
Copy link
Contributor Author

I think we should just consider the amount of files.

That makes sense. I'm not sure how the workflow would look for that, should it be it's own ticket?

@danmar
Copy link
Owner

danmar commented Dec 27, 2024

But I wonder why we calculate it based on on the file size anyways. That makes no sense because we are using the size of the actual input file. A file with just a few lines itself might end up with much more code which is being analyzed. So that would only make sense if we used the size of preprocessed code.

Originally it was number of files..

Then at some point we were discussing some project in the chat. I was running Cppcheck on the project. I had checked a number of the files. So the report said "X% done". And in the chat I said something like "I have checked X% already but well there are big files remaining so probably it's going to take longer time."
Then somebody in the chat suggested that we would base "X% done" on the file sizes.

It's still a very rough estimate. As you pointed out it depends a lot on what headers are included. And it can depend a lot on how many branches etc there are. A huge file with trivial code can run quickly.

I don't have super strong opinion.. but generally.. based on number of files the guess will be off more imho. A small file would generally go faster to analyse than a large file.

@danmar
Copy link
Owner

danmar commented Dec 27, 2024

A small file would generally go faster to analyse than a large file.

That is my "best guess".. I can't prove it. :-)

@ludviggunne
Copy link
Contributor Author

I don't have super strong opinion.. but generally.. based on number of files the guess will be off more imho. A small file would generally go faster to analyse than a large file.

Would it be possible to base it on the length of the token list?
Anyway, I suppose we can stick to filesizes in this PR?

@danmar
Copy link
Owner

danmar commented Dec 27, 2024

Would it be possible to base it on the length of the token list?

nah.. I believe it would be more accurate but that would increase the analysis time. we would need to open all the files and count all the tokens to know the total number of tokens.

@firewave
Copy link
Collaborator

The value is arbitrary and inaccurate either way - so I would go with the simpler implementation i.e. file count only. It would also match the default ('-j1') behavior.

@ludviggunne
Copy link
Contributor Author

I guess we actually have both right now, with 1/3 files checked 42%.

@firewave
Copy link
Collaborator

I guess we actually have both right now, with 1/3 files checked 42%.

Fair enough.

test/cli/proj2_test.py Outdated Show resolved Hide resolved
@ludviggunne ludviggunne marked this pull request as ready for review January 8, 2025 16:38
@@ -245,6 +245,9 @@ bool CmdLineParser::fillSettingsFromArgs(int argc, const char* const argv[])
if (mSettings.library.markupFile(fs.filename())) {
fs.file.setLang(Standards::Language::C);
}
std::string error = fs.updateFileSize();
if (!error.empty())
mLogger.printError(error);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you use mLogger.printError then it feels like we should abort. But I think it's unfortunate to abort if there are missing files. Could we write some proper cppcheck warnings during the analysis instead if a file cannot be opened.

How about just setting filesize to -1 or something:

  • when total file size is calculated ignore files that have size -1
  • during analysis ensure there is some proper cppcheck message when a file can not be opened.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the files can just be removed from the list (after showing a warning)? That way we don't have to care about it when calculating the total file size.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that error happens we should bail out - otherwise we would silently (nobody cares about the warnings because they are not visible in a headless system - yet) drop a file from the analysis which might cause issues to slip through or lead to inconsistent results between runs.

This should only happen if there are filesystem issues or somebody is modifying the data. You should avoid that. e.g. in the CLion IDE integration I pass a fixed temporary file to the analysis to avoid that a file which might constantly being changed is being read.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If mLogger.printError(error); is used and Cppcheck just aborts then I see the risk that Cppcheck plugins will silently just fail and the user get no proper information. For instance the vscode plugin does not show the Cppcheck printError output it just does not recognize its output and aborts. It just says "Failed to execute Cppcheck" or something like that.

nobody cares about the warnings because they are not visible in a headless system - yet

I don't know what you talk about. It would be pointless to run Cppcheck at all if nobody cares about the warnings. I want that we report proper Cppcheck warning output so the plugins have a good chance to show the output.

This should only happen if there are filesystem issues or somebody is modifying the data.

Yeah sure.. but we do run into this sometimes. And it would be nice to be able to scan a project even if some files are autogenerated during build or whatever..

@firewave
Copy link
Collaborator

firewave commented Jan 9, 2025

I guess we actually have both right now, with 1/3 files checked 42%.

Fair enough.

I still think we should drop the file size percentage. Looking at how much time and discussion has already been put into this and it is something a high percentage of the users will probably never see as they use headless systems (i.e. CI/IDE integrations). It seems the time might be better spent on something else than making an arbitrary value arbitrary in a different way...

@danmar
Copy link
Owner

danmar commented Jan 10, 2025

I still think we should drop the file size percentage

I don't want to drop it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants