Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop dreyfus_index processes on ddoc update #21

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

mayya-sharipova
Copy link
Contributor

Overview

Currently when ddoc is modified, dreyfus_index and dreyfus_index_updater
processes corresponding to the previous version of ddoc will keep running
until all indexing processing initiated by them is done.
When ddoc of a big database is rapidly modified, this puts a lot
of unnecessary strain on database resources.

This commit brings the following changes:

  1. When opening a dreyfus_index, always add a record
    {DbName, {DDocId, Sig}} to ?BY_DB.
  2. When ddoc_updated, check if there other ddocs in ?BY_DB with the same Sig.
    If there are other, only remove {DbName, {DDocId, Sig}}
    record from ?BY_DB for this ddoc.
    If there are no, stop dreyfus_index processes:
    • all dreyfus_index processes for the prev. version of ddoc will be shutdown
    • all linked to them dreyfus_index_updater processes will die as well
    • all waiters for indexing activity to be finished will receive an immediate
      reply: ddoc_updated. Interactive query requests will get response:
      {404, <<"not_found">>, <<"Design document was updated or deleted.">>}

Testing recommendations

make check apps=dreyfus tests=ddoc_update_test_

GitHub issue number

This is to rewrite: #16
To follow the same style as in: apache/couchdb@2e7ca45

BugzID: 85718

Currently when ddoc is modified, dreyfus_index and dreyfus_index_updater
processes corresponding to the previous version of ddoc will keep running
until all indexing processing initiated by them is done.
When ddoc of a big database is rapidly modified, this puts a lot
of unnecessary strain on database resources.

This commit brings the following changes:
1. When opening a dreyfus_index, always add a record
   {DbName, {DDocId, Sig}} to ?BY_DB.
2. When ddoc_updated, check if there other ddocs in ?BY_DB with the same Sig.
    If there are other, only remove {DbName, {DDocId, Sig}}
    	record from ?BY_DB for this ddoc.
    If there are no, stop dreyfus_index processes:
    * all dreyfus_index processes for the prev. version of ddoc will be shutdown
    * all linked to them dreyfus_index_updater processes will die as well
    * all waiters for indexing activity to be finished will receive an immediate
	reply: ddoc_updated. Interactive query requests will get response:
	{404, <<"not_found">>, <<"Design document was updated or deleted.">>}

BugzID: 85718
end
end, ets:match_object(?BY_DB, {DbShard, {DDocId, '_'}}))
end, DbShards),
{ok, nil};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is confusing to return nil. I think it is better to return St as we do in the next clause.

handle_cast(_Msg, State) ->
{noreply, State}.


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace

@@ -214,8 +234,14 @@ handle_info({'DOWN',_,_,Pid,Reason}, #state{
[gen_server:reply(P, {error, Reason}) || {P, _} <- WaitList],
{stop, normal, State}.

terminate(_Reason, _State) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rewrite this as

terminate({shutdown, ddoc_updated}, State) ->
    Waiters = State#state.waiting_list,
    [gen_server:reply(From, ddoc_updated) || {From, _} <- Waiters];
termiante(_Reason, _State) ->
    ok.


handle_cast({ddoc_updated, DDocResult}, #state{} = State) ->
#index{sig = Sig} = State#state.index,
KeepIndex = case DDocResult of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be called KeepIndexes? If I understand correctly, if one index in the new design doc has the same signature, then we don't kill the existing indexes. So if I had 1000 indexes, and I update 999 of them with a new definition (and thus a new signature), but only 1 still was unchanged, then that would mean 999 old indexes would still keep running right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonysun83 KeepIndex here means if we should keep the current dreyfus_index process or kill it. It refers to a current single dreyfus_index process. May be a better name could be KeepIndexProcess?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonysun83 The intention is to shutdown all changed indexes, so in your case all 999 old indexes will be killed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mayya-sharipova : understand now, had to re-read it again, not I got it

@@ -86,6 +90,9 @@ handle_call({open_error, DbName, Sig, Error}, {OpenerPid, _}, State) ->

handle_cast({cleanup, DbName}, State) ->
clouseau_rpc:cleanup(DbName),
{noreply, State};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we add this here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonysun83 Are you referring to the line 93? If yes, we did not add this here, it was here before - this a standard way to finish handle_cast. This line has changed as we added a new handle_cast clause.

@tonysun83
Copy link
Contributor

tonysun83 commented Sep 5, 2017

Looks pretty good. I saw in the tests that deletion was the method to see if index processes were killed or updated. Can one we add one test where we add a design doc with two index names, then update the definition for one index, and check to see that the old process with the old signature is killed? That coverage would be great.

@tonysun83
Copy link
Contributor

LGTM, +1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants