Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent Modification exception while importing same type vertices using multiple thread and buckets with thread strategy #1640

Open
RobertoSannino opened this issue Jun 26, 2024 · 3 comments
Assignees

Comments

@RobertoSannino
Copy link

RobertoSannino commented Jun 26, 2024

ArcadeDB Version:

docker image arcadedata/arcadedb:24.5.1

OS and JDK Version:

MacOS 14.2.1 (23C71)
Docker 4.17.0
Java 17
gremlin-driver:3.7.2
arcadedb-engine:24.5.1

Expected behavior

Importing vertices of type Course_offerings with N buckets and thread strategy using N threads should work without throwing
org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Concurrent modification on page PageId(1/4) in file 'Course_offerings_0.1.65536.v0.bucket' (current v.5 <> database v.6). Please retry the operation (threadId=70)
or
org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Record #10:10328 not found

Actual behavior

One thread is throwing the Concurrent modification exception while the others the Record not found exception

Steps to reproduce

SQL script

CREATE VERTEX TYPE Course_offerings IF NOT EXISTS BUCKETS 10;
ALTER TYPE Course_offerings BucketSelectionStrategy `thread`;

Gremlin query performed by the threads (each thread can perform the query multiple times for the Course_offerings type)

g.inject(rows).unfold()
        .mergeV(select('pk'))
            .option(Merge.onMatch, select('properties'))
            .option(Merge.onCreate, select('properties'));

rows parameter example:

test = [
  [pk: [(T.label): 'Course_offerings', id:1], props: [title:'X']], 
  [pk: [(T.label): 'Course_offerings', id:2], props: [title:'Y']]
];

Java Test to reproduce

```
void testAddV() {
    tinkerpopClient = ...
    int nOfThreads = 8;
    int batchSize = 100;
    int nOfProperties = 8;

   tinkerpopClient.submit("g.V().hasLabel('Imported').drop().iterate()").all().join();
    try (RemoteDatabase database = ...) {
        database.transaction(() -> {
            database.command("sql", "DROP TYPE Imported ");
            database.command("sql", "CREATE VERTEX TYPE Imported IF NOT EXISTS BUCKETS " + nOfThreads);
            database.command("sql", "ALTER TYPE Imported BucketSelectionStrategy `thread`");
        });
    } catch (Exception e) {
        e.printStackTrace();
    }

    String query = """
        g.inject(rows).unfold()
        .mergeV(select('pk'))
            .option(Merge.onMatch, select('properties'))
            .option(Merge.onCreate, select('properties'));
        """;

    class Importer implements Callable{
        private String id;

        public Importer(String id) {
            this.id = id;
        }

        @Override
        public Object call() {
            List<Map<String, Object>> queryInputParams = createQueryInputParams(batchSize);
            Map<String, Object> params = Map.of("rows", queryInputParams);
            System.out.println(Thread.currentThread().getName() + " Importing " + batchSize + " entries");
            int nOfResults = tinkerpopClient.submit(query, params).all().join().size();
            return Thread.currentThread().getName() + " -> " + nOfResults;
        }

        private List<Map<String, Object>> createQueryInputParams(int batchSize) {
            List<Map<String, Object>> queryInputParams = new ArrayList<>();
            for (int i = 0; i < batchSize; i++) {
                Map<String, Object> row = new HashMap<>();
                Map<Object, Object> pk = new HashMap<>();
                Map<Object, Object> prop = new HashMap<>();
                row.put("pk", pk);
                row.put("properties", prop);
                pk.put(T.label, "Imported");
                pk.put("id", System.currentTimeMillis());
                for (int j = 0; j < nOfProperties; j++) {
                    prop.put("p" + j, j);
                }
                queryInputParams.add(row);
            }
            return queryInputParams;
        }
    }

    ExecutorService executorService = Executors.newFixedThreadPool(nOfThreads);
    ExecutorCompletionService<Object> completionService = new ExecutorCompletionService<>(executorService);
    for (int i = 0; i < nOfThreads; i++)
        completionService.submit(new Importer(String.valueOf(i)));
    int receivedResults = 0;

    while (receivedResults < nOfThreads) {
        try {
            Future<Object> future = completionService.take();
            System.out.println("Results from " + future.get().toString());
            receivedResults++;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    if (!executorService.isTerminated()){
        try {
            executorService.shutdownNow();
            boolean terminated = false;
            int retry = 0;
            while (!terminated && retry < 5) {
                executorService.awaitTermination(retry, TimeUnit.MINUTES);
                terminated = executorService.isTerminated();
                retry++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }
}
@lvca
Copy link
Contributor

lvca commented Jun 26, 2024

How do you execute the query? Remotely via Gremlin Server, through ArcadeDB's HTTP protocol or Postgres?

@lvca lvca self-assigned this Jun 26, 2024
@RobertoSannino
Copy link
Author

RobertoSannino commented Jun 27, 2024

The gremlin query are performed using the java gremlin driver trough the Client.submit(query, params) method.

I've also tried to use the java arcadeddb client transaction then command method, but the T.label input (org.apache.tinkerpop.gremlin.structure) is not parsed correctly and throws the following exception
com.arcadedb.remote.RemoteException: Error on executing remote operation command (cause: class org.apache.tinkerpop.gremlin.structure.T$1 cannot be cast to class java.lang.String

Furthermore, by giving the input match label as a string ("T.label ") it is then not interpreted correctly by the remote server that tries to create vertex_types instead of Course_offerings

Even performing ALTER DATABASE arcadedb.typeDefaultBuckets 16 before running the query does not work for vertex_types (it seems to me that is not possible to perform an ALTER DATABASE arcadedb.bucketSelectionStrategy thread to perform the test )

@RobertoSannino
Copy link
Author

I've updated the question with a java test to reproduce the problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants