Perhaps fix an overflow bug related to the timer #426

Open: 2 commits to merge into base: master


@yulincoder yulincoder commented Jul 14, 2018

Hello, I found a bug in the timer system and tried to fix it. The bug causes the task of an overflowed timer to execute continuously and all of the other tasks to fail. The system crashes and the battery is quickly exhausted (this is not the "slightly higher runtime cost" mentioned in the code author's comment in the function, because 'remaining' will always be negative and cannot be corrected). The following code triggers the bug:

call Timer2.startPeriodic( 111UL << 28);  

After this call, you can see that the node always runs at full capacity: it keeps running the buggy timer task forever.
Although the code author noticed the overflow problem, he/she did not solve it.
This problem appears to have been triggered in an early version of the CTP protocol, but it was never really resolved in the underlying code. [ 11ff964 ]
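To make the trigger value concrete, here is a minimal sketch of the arithmetic, assuming a 32-bit unsigned long (as on the MSP430 in the TelosB). The helper names `period_as_u32` and `remaining_ticks` are hypothetical and exist only for this illustration. The explicit cast stands in for the truncation that a 32-bit `111UL << 28` would perform, so the sketch behaves the same on a 64-bit host.

```c
#include <stdint.h>

/* Hypothetical helper: the period as a 32-bit unsigned long would hold it.
   111 << 28 needs 35 bits, so only the low 32 bits survive. */
static uint32_t period_as_u32(void)
{
    return (uint32_t)(111ULL << 28);
}

/* Mirrors the expression `int32_t remaining = timer->dt - elapsed;`.
   The subtraction is unsigned; the result is then reinterpreted as signed
   (two's complement assumed, as on common compilers). */
static int32_t remaining_ticks(uint32_t dt, uint32_t elapsed)
{
    return (int32_t)(dt - elapsed);
}
```

With these definitions, `period_as_u32()` is `0xF0000000`, and `remaining_ticks(period_as_u32(), 0)` is `-268435456`, so the timer looks overdue from the moment it is started.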

The idea of my fix is to disable the timer that triggered the bug so that it cannot affect the other tasks. In this case, the other tasks in the system can handle events normally, and the battery won't be drained quickly.
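The idea can be sketched roughly as follows. This is not the diff from this pull request, only an illustration under assumptions: the record type `guard_timer_t` and the function `guard_timer` are hypothetical, and the check simply treats any dt above INT32_MAX as unsupported by the signed arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified timer record for this sketch. */
typedef struct {
    uint32_t t0;        /* start of the current period */
    uint32_t dt;        /* period length */
    bool     isrunning; /* whether the timer is active */
} guard_timer_t;

/* Returns true if dt is safe for the signed `remaining` computation.
   Otherwise the timer is disabled so it cannot keep the scheduler busy. */
static bool guard_timer(guard_timer_t *timer)
{
    if (timer->dt > (uint32_t)INT32_MAX) {
        timer->isrunning = false;
        return false;
    }
    return true;
}
```

A timer with dt = 0xF0000000 would be stopped by this check, while a timer with dt = 1000 would be left running. The trade-off is that the offending timer silently never fires.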

@yulincoder
Author

Any review?

@cire831
Member

cire831 commented Jul 23, 2018

I could have sworn I asked what machine you are building this for? But don't see any reference to it.

So what processor are you building this for?

I've looked at the code and don't understand why your code doesn't work. More investigation into the root cause is warranted.

@yulincoder
Author

yulincoder commented Jul 23, 2018

@cire831 Sorry, I closed another pull request and so did not notice your comment.
I am building TinyOS for the TelosB platform.
The cause of the system crash is as follows:

  1. In the updateFromTimer task, timer->dt may hold a large unsigned value in some cases, while elapsed is not large enough to offset it. In that case, remaining in int32_t remaining = timer->dt - elapsed; is always negative because of the type conversion from uint32_t (timer->dt is a uint32_t variable) to int32_t.
  2. As a result, the following condition is always satisfied, so fireTimers is always executed:
     if (min_remaining <= 0)
       fireTimers(now);
  3. Unfortunately, updateFromTimer is posted from within fireTimers, so the system gets caught in this endless cycle:
    updateFromTimer ---> fireTimers --->post updateFromTimer ---> updateFromTimer ---> fireTimers --->post updateFromTimer ---> .....
    As a result, the system runs at full capacity and other tasks can hardly run properly until the battery runs out.
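The three steps above can be modeled with a small simulation. It is a simplified sketch, not the TinyOS source: the type `sketch_timer_t` and the `sketch_*` functions are hypothetical, and the unsigned "is due" comparison inside the fire step is an assumption about how fireTimers decides whether to signal a timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified timer record; field names follow the discussion. */
typedef struct {
    uint32_t t0;        /* start of the current period */
    uint32_t dt;        /* period length */
    bool     isrunning;
} sketch_timer_t;

/* Step 1: signed remaining time, as computed in updateFromTimer. */
static int32_t sketch_remaining(const sketch_timer_t *timer, uint32_t now)
{
    uint32_t elapsed = now - timer->t0;
    return (int32_t)(timer->dt - elapsed);
}

/* Assumed fire check: an unsigned comparison, so a huge dt is never due. */
static bool sketch_is_due(const sketch_timer_t *timer, uint32_t now)
{
    uint32_t elapsed = now - timer->t0;
    return elapsed >= timer->dt;
}

/* Counts update -> fire -> post rounds that neither arm an alarm nor
   fire the timer, stopping after max_rounds. */
static unsigned sketch_spin_rounds(const sketch_timer_t *timer, uint32_t now,
                                   unsigned max_rounds)
{
    unsigned rounds = 0;
    while (rounds < max_rounds && timer->isrunning) {
        if (sketch_remaining(timer, now) > 0)
            break;      /* step 2 not taken: an alarm is armed, node sleeps */
        if (sketch_is_due(timer, now))
            break;      /* the timer fires normally */
        rounds++;       /* step 3: only updateFromTimer is posted again */
        now++;          /* a little time passes between tasks */
    }
    return rounds;
}
```

For a timer with dt = 0xF0000000 started at t0 = 0, `sketch_spin_rounds` uses up every round it is allowed, while a timer with dt = 1000 leaves the loop immediately because an alarm would be armed.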

@cire831
Member

cire831 commented Jul 24, 2018

okay. I'll see what I can do. I'll take a look at it in the next few days.

@yulincoder
Author

@cire831 Do you have a good fix?
