Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of product category rewrites (disabled product & categories and product_use_categories) #1405

Conversation

MarcinNowakMacopedia
Copy link
Contributor

@MarcinNowakMacopedia MarcinNowakMacopedia commented Jan 19, 2021

Description (*)

  1. I fixed issues when it was set not to use „Categories Path for Product URLs” (product_use_categories) in configuration (Catalog → Catalog → Search Engine Optimizations) and yet they can be found in URL Rewrite Management (commit bfe9329)
  2. I added field in configuration to set whether or not to generate rewrites for disabled categories and products (commit 9163a4d)

Related Pull Requests

Fixed Issues (if relevant)

  1. Fixes Magento is generating url rewrites for products for disabled categories #931
  2. Fixes Magento produces product category rewrites even if "Use Categories Path for Product URLs" is disabled #932

Manual testing scenarios (*)

Screenplay:

  1. Present situation
    a.) Set field product_use_categories in configuration to „yes”
    b.) Set field create_url_for_disabled in configuration to „yes”
    c.) In Index Management reindex „Catalog URL Rewrites”
    d.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite there are rows with set columns category_id and product_id.
    e.) In URL Rewrite Management check if there are rewrites for disabled products and categories.

  2. Products rewrites with categories path, but not for disabled products and categories.
    a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite are row with set fields category_id and product_id.
    b.) In URL Rewrite Management check if there are rewrites for disabled products and categories.
    c.) Set field create_url_for_disabled in configuration to „no”
    d.) In Index Management reindex „Catalog URL Rewrites”
    e.) Check if rewrites checked in a. still exist.
    f.) Check if rewrites checked in b. were removed.

  3. Products rewrites with categories path, but not for disabled categories.
    a.) Choose active category.
    b.) In URL Rewrite Management check if there are rewrites for products with this category path, or in DB check if in table core_url_rewrite there are rows with set columns product_id and this category id in category_id column.
    c.) In URL Rewrite Management check if there are rewrite for this category.
    d.) Set field create_url_for_disabled in configuration to „no”.
    e.) Disable chosen category.
    f.) Check if rewrites checked in b. and c. were removed.

  4. Product rewrites with categories path, but not for disabled products.
    a.) Choose active product.
    b.) In URL Rewrite Management check if there are rewrites for this product.
    c.) Set field create_url_for_disabled in configuration to „no”.
    d.) Disable chosen product.
    e.) Check if rewrites checked in b. were removed.

  5. Product rewrites without categories path and for disabled products and categories.
    a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite there are rows with set columns category_id and product_id.
    b.) In URL Rewrite Management check if there are rewrites for disabled products and categories.
    c.) Set field product_use_categories in configuration to „no”
    d.) In Index Management reindex „Catalog URL Rewrites”
    e.) Check if rewrites checked in a. were removed.
    f.) Check if rewrites checked in b. still exist.

  6. Product rewrites without categories path and not for disabled products and categories.
    a.) In URL Rewrite Management check if there are rewrites for products with categories path, or in DB check if in table core_url_rewrite are rows with set columns category_id and product_id.
    b.) In URL Rewrite Management check if there are rewrites for disabled products and categories.
    c.) Set field product_use_categories in configuration to „no”
    d.) Set field create_url_for_disabled in configuration to „no”.
    e.) In Index Management reindex „Catalog URL Rewrites”
    f.) Check if rewrites checked in a. and b. were removed.

Questions or comments

The time it takes for a full reindex on shop with 16802 products and 542 categories:

1. At present (before my changes):

First execute (when empty table core_url_rewrite):
1 – 56s
2 – 54s
3 – 55s
4 – 57s
5 – 55s

Another execute:
1 – 1m 4s
2 – 1m 1s
3 – 1m 3s
4 – 1m 3s
5 – 1m 1s

2. After changes

Configuration:

a.) product_use_categories = true and create_url_for_disabled = true (1. screenplay)

65614 - rewrites

First execute:
1 – 59s
2 – 56s
3 – 56s
4 – 57s
5 – 57s

Another execute:
1 – 1m 6s
2 – 1m 8s
3 – 1m 5s
4 – 1m 5s
5 – 1m 4s

b.) product_use_categories = true and create_url_for_disabled = false (2. screenplay)

47360 – rewrites

First execute:
1 – 50s
2 – 42s
3 – 40s
4 – 45s
5 – 43s

Another execute:
1 – 48s
2 – 50s
3 – 47s
4 – 51s
5 – 52s

From configuration a to b
1 – 47s
2 – 46s
3 – 48s
4 – 46s
5 – 46s

c.) product_use_categories = false and create_url_for_disabled = true (5. screenplay)

17342 - rewrites

First execute:
1 – 15s
2 – 15s
3 – 15s
4 – 15s
5 – 15s

Another execute:
1 – 17s
2 – 17s
3 – 17s
4 – 17s
5 – 17s

From configuration a. to c.
1 – 22s
2 – 21s
3 – 22s
4 – 21s
5 – 21s

d.) product_use_categories = false and create_url_for_disabled = false (6. screenplay)

14715 - rewrites

First execute:
1 – 13s
2 – 13s
3 – 13s
4 – 13s
5 – 13s

Another execute:
1 – 14s
2 – 14s
3 – 14s
4 – 14s
5 – 14s

From configuration a. to d.
1 – 18s
2 – 17s
3 – 17s
4 – 17s
5 – 17s

DB select to find rewrites to remove (app/code/core/Mage/Catalog/Model/Resource/Url.php::clearRewrites):

1. Configuration a.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_category_product` AS `tcp` ON tur.category_id = tcp.category_id AND tur.product_id = tcp.product_id
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
**// product rewrites with categories path and not connected with this category**
AND (tur.category_id IS NOT NULL) AND (tur.product_id IS NOT NULL) AND (tcp.category_id IS NULL) 
GROUP BY `url_rewrite_id`

2. Configuration b.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_product_entity_int` AS `cpei` ON cpei.entity_id = tur.product_id AND cpei.attribute_id = :product_status_attribute_id
LEFT JOIN `catalog_category_entity_int` AS `ccei1` ON ccei1.attribute_id = :is_active_category_attribute_id AND ccei1.store_id = 0 AND ccei1.entity_id = tur.category_id
LEFT JOIN `catalog_category_entity_int` AS `ccei2` ON ccei2.attribute_id = :is_active_category_attribute_id AND ccei2.store_id = :store_id AND ccei2.entity_id = tur.category_id
LEFT JOIN `catalog_category_product` AS `tcp` ON tcp.category_id = tur.category_id AND tcp.product_id = tur.product_id
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
AND (
	(
	**// rewrites for disabled products**
	cpei.value = 2
	**// rewrites for categories disabled in this store or in general store**
	 OR (IF(ccei2.value_id > 0, ccei2.value, ccei1.value) = 0) 
	**// product rewrites with categories path and not connected with this category**
	 OR (tur.category_id IS NOT NULL AND tur.product_id IS NOT NULL AND tcp.category_id IS NULL)
	)
)
GROUP BY `url_rewrite_id`

3. Configuration c.
db select to remove rewrites:

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
**// product rewrites with categories path** 
AND (tur.category_id IS NOT NULL) AND (tur.product_id IS NOT NULL) 
GROUP BY `url_rewrite_id`

4. Configuration d.

SELECT `tur`.`url_rewrite_id` FROM `core_url_rewrite` AS `tur`
LEFT JOIN `catalog_product_entity_int` AS `cpei` ON cpei.entity_id = tur.product_id AND cpei.attribute_id = :product_status_attribute_id
LEFT JOIN `catalog_category_entity_int` AS `ccei1` ON ccei1.attribute_id = :is_active_category_attribute_id AND ccei1.store_id = 0 AND ccei1.entity_id = tur.category_id
LEFT JOIN `catalog_category_entity_int` AS `ccei2` ON ccei2.attribute_id = :is_active_category_attribute_id AND ccei2.store_id = :store_id AND ccei2.entity_id = tur.category_id 
WHERE (tur.store_id = :store_id) AND (tur.is_system = 1) 
AND (
	(
	**// rewrites for disabled products**
	cpei.value = 2 
	**// rewrites for categories disabled in this store or in general store**
	OR (IF(ccei2.value_id > 0, ccei2.value, ccei1.value) = 0)
	**// product rewrites with categories path** 
	OR (tur.category_id IS NOT NULL AND tur.product_id IS NOT NULL) 
	)
) 
GROUP BY `url_rewrite_id`

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All automated tests passed successfully (all builds are green)

@github-actions github-actions bot added the Component: Catalog Relates to Mage_Catalog label Jan 19, 2021
@colinmollenhour
Copy link
Member

Excellent work @MarcinNowakMacopedia, thanks for contributing!

However, that is a lot of complex code changes to review with not many notes as to what bugs it fixes, what behaviors possibly changed, etc..

I did see the commit messages but a detailed description of the bugs, how to reproduce them, why your PR fixes the problems, what testing you've performed to ensure there are no regressions and some of the raw SQL before and after would really help us to review this and get it accepted much more readily.

@MarcinNowakMacopedia MarcinNowakMacopedia force-pushed the fix-issues-with-product-category-rewrites branch from 9163a4d to 119233b Compare January 21, 2021 12:32
@tmotyl tmotyl changed the title Fix issues with product category rewrites Improve performance of product category rewrites (disabled product & categories and product_use_categories) Jan 22, 2021
@tmotyl tmotyl added the performance Performance related label Jan 22, 2021
@tmotyl
Copy link
Contributor

tmotyl commented Jan 22, 2021

@colinmollenhour should be better now :)

@rvelhote
Copy link
Contributor

We've been testing this PR in our development environment and the results are quite promising. We didn't see any issues arising from it and plan on deploying this into production soon. Thank you!

The most significant gains come from fixing the bug related to the catalog/seo/product_use_categories option. If people are using canonical urls for products then it's not really doing anything in terms of SEO (please correct me if I'm wrong). It's nice to have the product's URL matching the hierarchy during navigation but it's a trade-off we can accept to improve performance.

We have 99 store views with ~900 categories and ~20000 products. A single store view used to take us as much as 120 seconds to refresh and with this patch, ignoring disabled products/categories as well as disabling category-product urls, it can take between 15 to 25 seconds per store. This means we can reindex our entire store in 30 to 40 minutes.

@rvelhote
Copy link
Contributor

It took quite some time but we finally started using this patch recently. We didn't make a full reindex yet but everything is working as it should for newer and updated products and categories.

In general I would propose this PR to be merged into the mainline OpenMage LTS.

The only change I would recommend, thinking of a wider audience, is to display a message, when switching product_use_categories from No to Yes, that reindexing URL Rewrites is recommended. I was even thinking that the original intention of still generating URLs was for a quicker transition and not actually an oversight (arguable but possible).

@rvelhote
Copy link
Contributor

Reporting back with an issue we found using this PR. When the option to remove URLs of disabled products and categories is enabled, if you have a bunch of products that were disabled and enable them via a mass action the URLs will not be generated.

@MarcinNowakMacopedia MarcinNowakMacopedia force-pushed the fix-issues-with-product-category-rewrites branch from 0731d2e to 7754468 Compare April 27, 2022 19:46
@MarcinNowakMacopedia
Copy link
Contributor Author

The only change I would recommend, thinking of a wider audience, is to display a message, when switching product_use_categories from No to Yes, that reindexing URL Rewrites is recommended. I was even thinking that the original intention of still generating URLs was for a quicker transition and not actually an oversight (arguable but possible).

Added

@MarcinNowakMacopedia
Copy link
Contributor Author

Reporting back with an issue we found using this PR. When the option to remove URLs of disabled products and categories is enabled, if you have a bunch of products that were disabled and enable them via a mass action the URLs will not be generated.

Mage_Catalog_Model_Indexer_Url didn't support mass action events. I have added this. Please check if it works now.

Sorry for my late answer. I hope it is still helpful.

@MarcinNowakMacopedia MarcinNowakMacopedia force-pushed the fix-issues-with-product-category-rewrites branch from 6d9fff2 to 2478e00 Compare April 28, 2022 15:12
@tmotyl tmotyl force-pushed the fix-issues-with-product-category-rewrites branch from 2478e00 to 985cea1 Compare June 6, 2022 10:56
@tmotyl
Copy link
Contributor

tmotyl commented Jun 6, 2022

rebased

@tmotyl tmotyl force-pushed the fix-issues-with-product-category-rewrites branch 2 times, most recently from 5ab75ea to 8341d2c Compare June 21, 2022 06:34
@tmotyl
Copy link
Contributor

tmotyl commented Jun 21, 2022

fixed issues reported by phpstan

fballiano
fballiano previously approved these changes Aug 3, 2022
@colinmollenhour
Copy link
Member

Anyone want to test this on their staging environments and report back? I currently don't have a production-size instance to test with.

@rvelhote
Copy link
Contributor

@colinmollenhour We've been using this PR for almost a year minus the changes that happened after my last message in October 2021. Even without the most recent changes that solve a couple of problems I reported, we have not encountered any breaking issues in our shop.

Currently, we only disabled Use Categories Path for Product URLs and kept generating the rewrites for disabled products/categories (due to the issue I previously reported with mass actions). I will create a task for my team to track the latest changes and report back ASAP.

We have 101 store views, ~32000 products and ~2500 categories (enabled and disabled). This has been an excellent patch for us.

Copy link
Member

@colinmollenhour colinmollenhour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcinNowakMacopedia Can you add a backend model for the new config field so that when it is changed the index is automatically marked dirty? Then the comments like 'Changing this option requires full reindex of "catalog_url"' can be removed since the index will be marked as requiring a reindex and it will be obvious. See \Mage_Catalog_Model_System_Config_Backend_Catalog_Category_Flat as an example.

@MarcinNowakMacopedia
Copy link
Contributor Author

@colinmollenhour I added Mage_Adminhtml_Model_System_Config_Backend_Seo to change the index status of categories and products after changing create_url_for_disabled in configuration. Additionally, I modified Mage_Adminhtml_Model_System_Config_Backend_Seo_Product to change product indexer status, after changing product_use_categories.
e33773c

@fballiano
Copy link
Contributor

@MarcinNowakMacopedia @tmotyl could you please check the conflicts?

@MarcinNowakMacopedia MarcinNowakMacopedia force-pushed the fix-issues-with-product-category-rewrites branch from 11755bc to 7efdb28 Compare September 27, 2022 11:38
when catalog/seo/product_use_categories in config has value false
…roduct and category.

Add in config field create_url_for_disabled. When it is set as false dont generate redirects for disabled products and categories, what improves performance. Default is true, thanks to it it is compatible backwards.
to change category flas and product flat indexers status to require_reindex when configuration changed
Change the way of getting the product entityTypeId.

Co-authored-by: Colin Mollenhour <[email protected]>
Copy link
Member

@colinmollenhour colinmollenhour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config values can be set at the store scope but by caching them in the constructor they will not respect the store scopes. I pushed a PR for your PR at macopedia#8

@sreichel sreichel added the review needed Problem should be verified label Jan 19, 2023
@sreichel
Copy link
Contributor

sreichel commented Jan 19, 2023

@colinmollenhour thanks. lgtm but waiting for more tests.

@colinmollenhour
Copy link
Member

@MarcinNowakMacopedia Are you able to look at my PR macopedia#8? I am not able to push directly to your branch.

@fballiano
Copy link
Contributor

I'll close and reopen this PR since it's not "editable" by admins we can't fix anything in order to finish it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Adminhtml Relates to Mage_Adminhtml Component: Catalog Relates to Mage_Catalog performance Performance related review needed Problem should be verified
Projects
None yet
7 participants