Improve fromBodyXML to convert Spark's kitchen sink. #58
Conversation
The test fixture we were using came from Spark - however, C&M apply various transformations, which means the bodyXML we publish from Spark isn't necessarily the same as the bodyXML a user would read from the API. For the purposes of this transformer, we are interested in the latter, so this change updates the test fixture file to be the published version of the bodyXML. I have also taken the expected JSON from the bodyTree produced by cp-content-pipeline-api (removing the "data" blocks that contain referenceIds). There may be other places where the CP-generated tree is not actually valid, but we will deal with those as we go along.
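Stripping those blocks out of the expected JSON amounts to a recursive walk, roughly like this (a throwaway sketch, not code in this PR; the `referenceId` key name is an assumption, not checked against the fixture):

```ts
// Throwaway sketch for cleaning the expected-output fixture:
// walk the JSON and drop "data" blocks that carry a referenceId.
// NOTE: the "referenceId" key name is an assumption.
function stripReferenceData(node: unknown): unknown {
  if (Array.isArray(node)) {
    return node.map(stripReferenceData);
  }
  if (node !== null && typeof node === 'object') {
    const result: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(node)) {
      const isReferenceData =
        key === 'data' && value !== null && typeof value === 'object' && 'referenceId' in value;
      if (!isReferenceData) {
        result[key] = stripReferenceData(value);
      }
    }
    return result;
  }
  return node;
}
```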
This introduces a new internal type `__LIFT_CHILDREN__`, which is used to indicate that we don't want this node, but do want its children. Examples: `<experimental>`, `<div class="n-content-layout__container">`.
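Roughly how the lifting works, as a sketch (the types and function names here are illustrative, not the library's actual API): a transformer returns a node of type `__LIFT_CHILDREN__`, and the tree builder splices that node's transformed children into the parent's child list instead of keeping the node.

```ts
// Sketch only: the types and names here are illustrative, not the library's actual API.
type XmlNode = { tag: string; attributes: Record<string, string>; children: XmlNode[] };
type TreeNode = { type: string; children?: TreeNode[]; [key: string]: unknown };

const LIFT_CHILDREN = '__LIFT_CHILDREN__';

// Transformer for wrapper elements we don't want in the tree, e.g. <experimental>
// or <div class="n-content-layout__container">: drop the node, keep the children.
function transformWrapper(_node: XmlNode): TreeNode {
  return { type: LIFT_CHILDREN };
}

// When assembling the tree, a __LIFT_CHILDREN__ result is replaced by the
// transformed children of the original XML node, spliced into the parent's list.
function buildTree(node: XmlNode, transform: (n: XmlNode) => TreeNode): TreeNode[] {
  const result = transform(node);
  const children = node.children.flatMap((child) => buildTree(child, transform));
  if (result.type === LIFT_CHILDREN) {
    return children;
  }
  return [{ ...result, children }];
}
```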
thought (non-blocking): I have just a couple of thoughts, if useful; nothing closely related to the code of the PR.
I am also thinking that the starting format for this transformer should be the `bodyXML` format returned by the Internal Content API. That way the CP implementation and this one will use the same starting point.
The biggest semantic difference between the `bodyXML` format that Spark publishes and the one returned from the Internal Content API is the opaque and translucent namespaces. However, while I think they will play a role in the complexity of the other transformer (content tree -> external bodyXML), they shouldn't be an issue for this one.
I really don't have enough knowledge to weigh in on the CP implementation. Is it possible to build a testing strategy that makes sure this implementation builds exactly the same Content Tree as the CP implementation (do we actually want this)? For example, we could iterate over lots of (or all) uuids and compare.
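A rough sketch of what that comparison could look like (the fetch/build helpers here are hypothetical placeholders, not existing APIs):

```ts
// Sketch of a periodic comparison job; the declared helpers are hypothetical
// stand-ins for the CP pipeline output and this library's transformer.
import assert from 'node:assert';

declare function fetchCpContentTree(uuid: string): Promise<unknown>; // hypothetical
declare function fetchPublishedBodyXml(uuid: string): Promise<string>; // hypothetical
declare function fromBodyXml(bodyXml: string): unknown; // stand-in for this library's entry point

async function findMismatches(uuids: string[]): Promise<string[]> {
  const mismatches: string[] = [];
  for (const uuid of uuids) {
    const expected = await fetchCpContentTree(uuid);
    const actual = fromBodyXml(await fetchPublishedBodyXml(uuid));
    try {
      assert.deepStrictEqual(actual, expected);
    } catch {
      mismatches.push(uuid);
    }
  }
  return mismatches;
}
```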
I like that idea! Maybe a test that runs in Circle periodically that:
Spark has a kitchen sink article, which has most of the types of content produced by Spark
The `from-bodyxml` library in this repository was originally written to work with simple Sustainable Views articles, and so did not have the transformers required to convert this kitchen sink; there was a failing test in the test suite. The failing test was actually using XML from Spark, which represents what Spark publishes, not what is available in the public C&M APIs. I think when we use this library, we would be using it with the public API representation (🤔), so I updated the fixtures to use that. I also updated the expected output to be exactly what is returned by FT.com's API, as this is what we're currently serving users.
Then I systematically went through and fixed things until the tests passed.
**Question:** should we just use the existing implementation of this from CP?
Pro: it already exists, and has been battle-tested in the real world, so we can be confident it works on all real articles.
Con: it might need some decoupling from CP stuff. They also have a bunch of workarounds - but presumably we need a strategy for those anyway for the migration?!
There are a few things that it does not have: