DOC-2398-Schema Change operation #222
base: 4.1
== Schema Change Operations Best Practices

Schema changes are essential for maintaining and improving your graph's structure. However, they must be done carefully to avoid impacting ongoing operations.
Following best practices minimizes the risk of failures and ensures smooth execution.
check your grammar
* *Do ensure no active queries or loading jobs.* Run schema changes only after aborting ongoing jobs that might reference schema components being modified.
* *Do perform schema changes during low-traffic periods.* This reduces the chances of failure and conflicts during schema modification.
These seem to be contradictory rules.
Rule 1 says no queries or loading jobs.
Rule 2 says "low traffic". What type of traffic is allowed? It also doesn't answer the question of how much is "low".
*Verifying Schema Consistency:*

* Use the `gsql` commands to check the schema version after the schema change. This will confirm that the schema updates have been applied correctly, and that no unintended changes have occurred.
What GSQL commands?
@Tushar-TG-14 If you don't know the answer to this question, then the reader won't know either.
Schema changes will invalidate queries and loading jobs dependent on modified schema components.
This may cause errors or failures in queries trying to reference altered or removed schema elements.
@deepkhajanchi What does "invalidate" mean in this context?
Does it mean the query and/or loading job has been "uninstalled"? (A query can be uninstalled, but I don't know about loading jobs.)
Or does it mean "the user can still run it, but it might be broken and not work correctly"?
I think we can/should be more specific about the rules for when a query/job will be invalid.
* A loading job becomes invalid if it refers to a vertex or and an edge which has been *dropped* (deleted) or *altered*.
* A query becomes invalid if it refers to a vertex, and edge, or an attribute which has been *dropped*.
We should always be careful to distinguish between individual vertices/edges vs. vertex/edge types.
Loading jobs refer to vertex/edge types; a job will be invalid if it refers to a type or an attribute of a type that no longer exists.
A query usually refers to types but in theory could refer to individual vertices and edges. I think this rule is meant to also refer to types.
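To make the type-level rule concrete, here is a minimal sketch in Python (not GSQL, and not TigerGraph's actual validation logic). It assumes a schema can be modeled as a dictionary from type name to a set of attribute names. `job_is_invalid` and the `Person`/`Knows` types are hypothetical names invented for illustration. It covers only the "no longer exists" case, not the "altered" case.

```python
# Hypothetical model: a schema maps each vertex/edge TYPE name to its attribute names.
# This illustrates the rule only; it is not how TigerGraph checks jobs internally.

def job_is_invalid(schema, references):
    """Return True if any referenced type, or attribute of a type, is missing.

    schema:     dict mapping type name -> set of attribute names
    references: dict mapping type name -> set of attribute names the job uses
    """
    for type_name, attrs in references.items():
        if type_name not in schema:
            return True  # the type itself was dropped
        if not attrs <= schema[type_name]:
            return True  # at least one referenced attribute was dropped
    return False


before = {"Person": {"name", "age"}, "Knows": {"since"}}
after = {"Person": {"name"}}  # attribute "age" and edge type "Knows" were dropped

load_names = {"Person": {"name"}}   # still valid after the change
load_ages = {"Person": {"age"}}     # refers to a dropped attribute
load_knows = {"Knows": {"since"}}   # refers to a dropped type
```

Note that the check never looks at individual vertices or edges, only at types and their attributes, which is the distinction the rule should state.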
2nd comment:
Use "DROP" and "DELETE" properly.
In SQL and GSQL, these two keywords have distinct meanings. Therefore, you should never say one is a synonym of the other.
DROP is to remove a catalog item. Catalog items include vertex TYPES, edge TYPES, graphs, queries, and loading jobs. [In a relational database, you could DROP a table.]
DELETE is to remove a data object from the database - a vertex or edge. [In a relational database you could DELETE a row of a table.]
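The relational analogy in brackets can be sketched with SQLite through Python's standard `sqlite3` module. SQLite is chosen here only as a convenient, assumed stand-in, and the `person` table is hypothetical. This is SQL, not GSQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany("INSERT INTO person VALUES (?)", [("Ann",), ("Bob",)])

# DELETE removes a data object (a row); the table itself stays in the catalog.
conn.execute("DELETE FROM person WHERE name = ?", ("Bob",))
rows_after_delete = conn.execute("SELECT name FROM person").fetchall()

# DROP removes the catalog item (the table) together with all of its rows.
conn.execute("DROP TABLE person")
tables_after_drop = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

After the DELETE, the table still exists and holds the remaining row. After the DROP, the catalog no longer lists the table at all.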
*Verifying Data Consistency:*

* Run the `gstatusgraph` command to check that the data size remains consistent with what was recorded before the schema change. This will ensure that no data loss or corruption has occurred during the schema update.
Data loss and data corruption are among the worst errors a database can commit. Think about it.
- Shouldn't we recommend that users perform a backup before performing a schema change?
- Check other database products to see if/when they mention "data loss", "data corruption", or "data inconsistency". We want to see how they mention this.

Compare these two:
- "It is essential to make sure that there are no loading jobs in progress when you start the schema change. Failure to follow this rule may result in data corruption." (It's the user's fault.)
- "Schema changes sometimes result in data corruption." (It's TigerGraph's fault.)
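If the doc keeps the "compare against what was recorded before" step, the comparison itself could be sketched as below. This is an assumption-heavy illustration: it does not parse `gstatusgraph` output (its format is not shown in this thread). `changed_counts` is a hypothetical helper, and the per-type counts are made-up numbers a user would have recorded by hand.

```python
def changed_counts(before, after):
    """Return {type_name: (count_before, count_after)} for every type whose count differs.

    A type missing from one snapshot is reported with None on that side.
    """
    diffs = {}
    for type_name in sorted(set(before) | set(after)):
        b, a = before.get(type_name), after.get(type_name)
        if b != a:
            diffs[type_name] = (b, a)
    return diffs


# Hypothetical numbers recorded before and after a schema change.
before = {"Person": 1000, "Knows": 5000}
after = {"Person": 1000, "Knows": 4990}
unexpected = changed_counts(before, after)
```

An empty result means the recorded sizes match. A non-empty result only flags a difference to investigate and does not by itself say whether the change was intended.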