How to properly loop through these external URLs to get them into the sitemap
I have a filtered set of URLs that I would like to push into a sitemap. I am using one of the sitemap plugins, which has hooks to further modify it.
My code so far:
// add to theme's functions.php
add_filter('bwp_gxs_external_pages', 'bwp_gxs_external_pages');
function bwp_gxs_external_pages($pages)
{
    return array(
        array('location' => home_url('www.example.com/used-cars/location/new-york/model/bmw'), 'lastmod' => '27/03/2017', 'frequency' => 'auto', 'priority' => '1.0'),
        array('location' => home_url('www.example.com/used-cars/location/los-angeles/model/aston-martin'), 'lastmod' => '27/03/2017', 'frequency' => 'auto', 'priority' => '0.8'),
        array('location' => home_url('www.example.com/used-cars/model/mercedes-benz'), 'lastmod' => '27/03/2017', 'frequency' => 'auto', 'priority' => '0.8')
    );
}
As you can see in my code, I have these kinds of URLs:
www.example.com/used-cars/location/new-york/model/bmw
and
www.example.com/used-cars/model/mercedes-benz
My issue is that there are thousands of these URLs and I need to push them all into this sitemap.
So my question is: isn’t there a way to loop over them, rather than inserting them into the code one by one like so?
array('location' => home_url('www.example.com/used-cars/model/aston-martin'), 'lastmod' => '27/03/2017', 'frequency' => 'auto', 'priority' => '0.8')
This topic was modified 8 years ago by Nkululeko.
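The looping idea asked about above can be sketched in plain PHP as follows. This is only an illustration, not the plugin's own API: `build_external_pages()` is a hypothetical helper, the base URL and slug lists are made-up inputs, and the array keys simply mirror the ones in the question. On a real site the slugs would more likely come from the `vehicle_model` and `vehicle_location` taxonomies via get_terms(), as discussed later in the thread.

```php
<?php
// Hypothetical helper (not part of any plugin): builds the same array shape
// as the hand-written callback in the question, but from lists of slugs.
function build_external_pages( $base_url, $model_slugs, $location_slugs, $lastmod ) {
    $pages = array();
    foreach ( $model_slugs as $model ) {
        // One entry for the model on its own.
        $pages[] = array(
            'location'  => $base_url . '/used-cars/model/' . $model,
            'lastmod'   => $lastmod,
            'frequency' => 'auto',
            'priority'  => '0.8',
        );
        // One entry for every location/model combination.
        foreach ( $location_slugs as $location ) {
            $pages[] = array(
                'location'  => $base_url . '/used-cars/location/' . $location . '/model/' . $model,
                'lastmod'   => $lastmod,
                'frequency' => 'auto',
                'priority'  => '0.8',
            );
        }
    }
    return $pages;
}

// Example input; these values are assumptions for illustration only.
$pages = build_external_pages(
    'https://www.example.com',
    array( 'bmw', 'mercedes-benz' ),
    array( 'new-york', 'los-angeles' ),
    '27/03/2017'
);
```

With two models and two locations this yields six entries: each model once on its own, plus once per location.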
Most assuredly there is a glitch somewhere. The problem is we don’t know where. It’s odd that hide_empty being false causes an error, unless some other code is assuming each term has a post and an error arises from that bad assumption.
So only terms from vehicle posts. What other terms would there be? The terms with no posts of course are suppressed by hide_empty being true. Taxonomy terms for vehicle model and vehicle location sound like they would only apply to vehicles.
I’m not sure how knowing the answer will help. Your get_terms() code looks good, there’s not much to go wrong there, yet something is wrong. All I can really suggest is setting up a process of elimination as I described at the end of a previous post.
I see. The reason I thought the problem could be the plugin itself was what I read on the plugin’s FAQ page.
I once modified my custom post type title to get taxonomy term names from the term_ID on this very same website I’m working on, and it functions perfectly. This is the code I used:
add_filter('wpseo_title', 'vehicle_listing_title');
function vehicle_listing_title( $title ) {
    if ( is_post_type_archive('used-cars') && $id = get_queried_object_id() ) {
        $locations = get_query_var('location');
        $location = get_term_by('slug', $locations, 'vehicle_location');
        $models = get_query_var('model');
        $model = get_term_by('slug', $models, 'vehicle_model');
        $title = '';
        if ( $model ) $title .= $model->name . ' used';
        else $title .= 'Used';
        $title .= ' cars for sale';
        if ( $location ) $title .= ' in ' . $location->name;
        $title .= ' on ' . get_bloginfo('name');
        return $title;
    }
    return $title;
}
Basically this code gets the term name from the same object that we are trying to get the slug from, and it works perfectly fine. Do you still think the problem could be from the theme?
The one thing I haven’t tried is “To narrow down the conflict source, temporarily copy your get_terms() calls to near the bottom of your theme’s page.php template. Add code to var_dump the assigned variables.” I’ll get on it right now.
It’s really confusing. I don’t know if this is going to change anything, but I’m considering using Yoast to generate this sitemap, which was my first choice initially. I’m just not sure about this code, so please check this example out:
function add_sitemap_custom_items(){
    $sitemap_custom_items = '<sitemap>
<loc>https://www.website.com/custom-page-1/</loc>
<lastmod>2012-12-18T23:12:27+00:00</lastmod>
</sitemap>
<sitemap>
<loc>https://www.website.com/custom-page-2/</loc>
<lastmod>2012-12-18T23:12:27+00:00</lastmod>
</sitemap>
<sitemap>
<loc>https://www.website.com/custom-page-3/</loc>
<lastmod>2012-12-18T23:12:27+00:00</lastmod>
</sitemap>';
    return $sitemap_custom_items;
}
add_filter( 'wpseo_sitemap_index', 'add_sitemap_custom_items' );
Do you think I could use this function to achieve what I’m trying to do? It seems like it’s used to add an external sitemap rather than external pages. I’m not sure, what do you think?
One more thing: there is another error I might have missed that showed up after I turned on WP_DEBUG. I think this could be causing the glitch.
“Warning: include(C:\wamp\www\autocity/wp-content/advanced-cache.php) [function.include]: failed to open stream: No such file or directory in C:\wamp\www\autocity\wp-settings.php on line 84”
Call Stack
#  Time    Memory  Function                                                Location
1  0.0034  368568  {main}( )                                               ..\index.php:0
2  0.0173  372192  require( 'C:\wamp\www\autocity\wp-blog-header.php' )    ..\index.php:17
3  0.0189  397648  require_once( 'C:\wamp\www\autocity\wp-load.php' )      ..\wp-blog-header.php:13
4  0.0208  411816  require_once( 'C:\wamp\www\autocity\wp-config.php' )    ..\wp-load.php:37
5  0.0234  580448  require_once( 'C:\wamp\www\autocity\wp-settings.php' )  ..\wp-config.php:92
and then there is…
“Warning: include() [function.include]: Failed opening ‘C:\wamp\www\autocity/wp-content/advanced-cache.php’ for inclusion (include_path=’.;C:\php\pear’) in C:\wamp\www\autocity\wp-settings.php on line 84”
Call Stack
#  Time    Memory  Function                                                Location
1  0.0034  368568  {main}( )                                               ..\index.php:0
2  0.0173  372192  require( 'C:\wamp\www\autocity\wp-blog-header.php' )    ..\index.php:17
3  0.0189  397648  require_once( 'C:\wamp\www\autocity\wp-load.php' )      ..\wp-blog-header.php:13
4  0.0208  411816  require_once( 'C:\wamp\www\autocity\wp-config.php' )    ..\wp-load.php:37
5  0.0234  580448  require_once( 'C:\wamp\www\autocity\wp-settings.php' )  ..\wp-config.php:92
The problem is, I have no idea what this is.
Those errors are from trying to access a file used by a caching plugin. There’s no reason to cache on localhost (except to test the caching plugin), it probably came from a file on a production server? You can safely ignore the errors, or speed things up a bit by commenting out the offending lines in wp-settings.php. Just don’t leave them commented out if you copy back to production! (there would be no reason to copy back)
In any case, it would not affect our problem. The get_term_by() function creates its own query and is not filterable, so is fairly reliable. However, you need to know what you’re looking for in order to use it. get_terms() is capable of getting everything without you needing to know exactly what that is.
I like the Yoast filter a lot more than the bwp_gxs one. There’s no strange blackbox post processing. What you return ends up on the sitemap, full stop. That does mean we have to get the return exactly right; there’s no margin for error. The Yoast filter can be used for any kind of sitemap content. Whatever is returned is appended directly to the sitemap. I’m not sure what you mean by external sitemap vs. external pages. It may not matter.
The thing is though, you still need to run some sort of loop based on get_terms() in order to generate the content. On top of that, it’s unclear to me if Yoast will break up a huge list into discrete parts.
I suppose something to try is to copy your bwp_gxs filter function internals into the example Yoast filter code, replacing all the current internals. The get_terms() calls and nested loops are all the same. You should probably keep hide_empty as true for now. What’s different in generating the data is that, instead of adding to an array, we concatenate strings to generate one very long string.
Aside from the get_terms() part, the rest should be like this:
// Loop through the search terms
$pages = '';
foreach ( $models as $model ) {
    $location2 = home_url( '/used-cars/model/' . $models->slug );
    $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
    foreach ( $locations as $location ) {
        $location2 = home_url( '/used-cars/location/' . $location->slug . '/model/' . $model->slug );
        $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
    }
}
return $pages;
Confirm the <changefreq> term (weekly) is appropriate, “auto” is not appropriate in this context. Valid terms are: always, hourly, daily, weekly, monthly, yearly, or never. When the dust has settled, if this method is decided upon, ask me about auto updating the <lastmod> date term.
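To make the point about valid <changefreq> terms concrete, here is a minimal sketch in plain PHP. `sitemap_entry()` is a hypothetical helper, not part of Yoast or WordPress, and falling back to 'weekly' for an invalid value such as 'auto' is an assumption made only for this example.

```php
<?php
// Hypothetical helper: builds one entry string in the same shape as the
// loop above, rejecting change frequencies that are not valid terms.
function sitemap_entry( $loc, $lastmod, $changefreq, $priority ) {
    $valid = array( 'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never' );
    if ( ! in_array( $changefreq, $valid, true ) ) {
        // Assumption for this sketch: fall back to 'weekly' for anything else.
        $changefreq = 'weekly';
    }
    // Escape characters such as & that would otherwise break the XML.
    $loc = htmlspecialchars( $loc, ENT_QUOTES, 'UTF-8' );
    return "<sitemap><loc>$loc</loc><lastmod>$lastmod</lastmod>"
        . "<changefreq>$changefreq</changefreq><priority>$priority</priority></sitemap>\n";
}

// 'auto' is not a valid term, so the helper substitutes 'weekly'.
$entry = sitemap_entry(
    'https://www.example.com/used-cars/model/bmw/',
    '2017-04-04T23:59:00+00:00',
    'auto',
    '0.8'
);
```

The tag names simply mirror the code above. As far as the sitemaps.org protocol goes, <changefreq> and <priority> are defined for <url> entries in a URL set, while <sitemap> entries in an index carry only <loc> and <lastmod>, so it may be worth checking which kind of file the filter output ends up in.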
Now comment out your bwp_gxs add_filter line, and deactivate that plugin. Configure Yoast to generate sitemaps and see what happens. If get_terms() is working correctly, there should be URLs for all terms assigned to any posts. If that looks good, change hide_empty to false and check the sitemap again.
If you encounter any problems, we are back to where you need to deactivate everything except Yoast to isolate the problem source. Also post your full Yoast sitemap filter code here. I’d like to double check all is good with that. If all this checks out, the only possibilities for the source of get_terms() problems would be either Yoast (unlikely) or your theme.
If it’s your theme, it isn’t too dire. If you can find where it’s hooked into get_terms(), the hook can be removed from a child theme or custom plugin and replaced with a corrected version that doesn’t impact other uses of get_terms().
Oh I see, it makes sense now.
Ah yes, I hope it works out; it was my ideal choice anyway. I think it does break entries up: there is a section where you can set a limit for entries in a sitemap, so I think that won’t be an issue.
So I tried the code as you said. I could be doing something wrong, because it returned an error. This is what it showed:
Parse error: syntax error, unexpected $end in C:\wamp\www\autocity\wp-content\themes\autocity\functions.php on line 185
add_filter( 'wpseo_sitemap_index', 'add_sitemap_custom_items' );
function add_sitemap_custom_items(){
{
    $models = get_terms( array(
        'taxonomy' => 'vehicle_model',
        'hide_empty' => true,
    ) );
    $locations = get_terms( array(
        'taxonomy' => 'vehicle_location',
        'hide_empty' => true,
    ) );
    // Loop through the search terms
    $pages = '';
    foreach ( $models as $model ) {
        $location2 = home_url( '/used-cars/model/' . $models->slug );
        $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        foreach ( $locations as $location ) {
            $location2 = home_url( '/used-cars/location/' . $location->slug . '/model/' . $model->slug );
            $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        }
    }
    return $pages; <----- this is line 185
I’ve marked line 185 in the code. Also, I see that on the lines after the foreach() statements you wrote the variables as singular, e.g.
$location2 = home_url( '/used-cars/location/' . $location->slug . '/model/' . $model->slug );
Is this a typo, or is that how it should be?
The syntax error appears to be in two places. There are two opening braces at the beginning of the function and apparently no closing ones at the end. (unless it’s a copy/paste error) They need to match up.
function add_sitemap_custom_items() {
    $models = get_terms( array(
    // more code...
    return $pages;
}
If by variables you mean $location and $model, then yes, that is correct. These variables are to match up with whatever name was established in the corresponding foreach after the ‘as’ part. I could have done
foreach ($models as $goblygooks)
then used
$goblygooks->slug
for all it matters. As for the ‘slug’ property, that is correct. For the parts in quotes, I copied them from your original post; if they are incorrect then please change accordingly. The correct string is whatever works on your site.
Right, I got it.
Yeah, I meant the $location and $model part, I believe everything else is the way it should be.
So I tried the code, and it did not return anything when I generated the sitemap. Please double check the final code; maybe you will pick up on a glitch somewhere.
add_filter( 'wpseo_sitemap_index', 'add_sitemap_custom_items' );
function add_sitemap_custom_items(){
    $models = get_terms( array(
        'taxonomy' => 'vehicle_model',
        'hide_empty' => true,
    ) );
    $locations = get_terms( array(
        'taxonomy' => 'vehicle_location',
        'hide_empty' => true,
    ) );
    // Loop through the search terms
    $pages = '';
    foreach ( $models as $model ) {
        $location2 = home_url( '/used-cars/model/' . $models->slug );
        $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        foreach ( $locations as $location ) {
            $location2 = home_url( '/used-cars/location/' . $location->slug . '/model/' . $model->slug );
            $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        }
    }
    return $pages;
}
One glitch in this line:
$location2 = home_url( '/used-cars/model/' . $models->slug );
It must be $model->slug, singular form. This causes a mere warning, so you still would get partial output.
Other than that, the XML output looks good, so it appears there’s still a problem with get_terms() working correctly. See if you can find any sort of hook to “get_terms” within your theme or the plugins you cannot deactivate.
I’ve got to run right now, I’ll see if I can think up any other debugging or work around technique later.
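The singular/plural point can be seen in isolation with a small sketch. The objects below are stand-ins for the term objects that get_terms() returns; only a `slug` property is modelled, and the slugs themselves are made up for illustration.

```php
<?php
// Stand-ins for term objects; only the slug property is modelled here.
$models = array(
    (object) array( 'slug' => 'bmw' ),
    (object) array( 'slug' => 'aston-martin' ),
);

$paths = array();
foreach ( $models as $model ) {
    // $model is the current element. $models is still the whole array,
    // which has no slug property of its own.
    $paths[] = '/used-cars/model/' . $model->slug;
}

// The name after "as" is arbitrary, as long as the loop body uses the same one.
$same = array();
foreach ( $models as $anything ) {
    $same[] = '/used-cars/model/' . $anything->slug;
}
```

Both loops build the same list of paths; only the variable name differs.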
Here’s a possible workaround to the get_terms() problem. Add this function above the add_filter() call:
function alt_get_terms( $tax ) {
    $hide_empty = true;
    $term_query = new WP_Term_Query(['taxonomy'=> $tax, 'hide_empty'=> $hide_empty,]);
    return $term_query->get_terms();
}
In the add sitemap items function, make everything between the function line and the Loop comment into a comment, i.e. wrap the $models and $locations assignment code with block comment delimiters: /* [code] */
Add the following between the new block comment and the Loop comment:
remove_all_filters('get_terms_args');
remove_all_filters('terms_clauses');
$models = alt_get_terms('vehicle_model');
$locations = alt_get_terms('vehicle_location');
If that does the trick, make the remove_all_filters() calls into comments. Test again. If all still appears well, change the true in the $hide_empty line of the alt_get_terms() function to false. If all is still working at this point, I think we're done!
What this all does is make a simplified, app specific version of get_terms() that bypasses some possible filtered influence from other code. There are still a couple of filters that could be causing problems, so I've initially removed all potential filters. Doing this could cause problems elsewhere, so it'd be better to not do that, which is why I've asked you to comment them out once we can see this workaround works.
Finally, we stop hiding empty terms (hide_empty set to false) once we are sure everything else is working properly.
Alright, I see where you’re going with it.
I did as you said, but there seems to be an error on this line of the function:
$term_query = new WP_Term_Query(['taxonomy'=> $tax, 'hide_empty'=> $hide_empty,]);
This is what I got
Parse error: syntax error, unexpected '[', expecting ')' in C:\wamp\www\autocity\wp-content\themes\autocity\functions.php on line 165
I tried to remove the ‘[’ but it led to even more errors.
Also, I’ve been meaning to ask you: a friend suggested that we try the get_queried_object() function as a parameter in get_the_terms(), e.g.
$obj = get_queried_object();
$models = get_the_terms( $obj, 'vehicle_model' );
Could something like this work?
Ugh, I guess you have an older version (pre 5.4) of PHP where the [{array_items}] ‘short’ array syntax is not valid. You should consider updating if you can; the latest stable version is 7.1.4. They jumped from 5.6 to 7.0, so it’s not as out of date as it might sound.
Still, I shouldn’t have assumed short syntax support. Lots of people are still pre 5.4 like you and WP only requires 5.2.4 or better. I like the short syntax because it helps me keep nested parentheses in order. Here’s the proper, pre 5.4 syntax:
$term_query = new WP_Term_Query( array('taxonomy'=> $tax, 'hide_empty'=> $hide_empty,));
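As a quick illustration that the two spellings are interchangeable on PHP 5.4 and later, the sketch below builds the same argument array both ways. The variable values are just the ones used in this thread; nothing here touches WordPress.

```php
<?php
$tax        = 'vehicle_model';
$hide_empty = true;

// Long syntax: works on every PHP version.
$long = array( 'taxonomy' => $tax, 'hide_empty' => $hide_empty );

// Short syntax: a parse error before PHP 5.4, identical result afterwards.
$short = [ 'taxonomy' => $tax, 'hide_empty' => $hide_empty ];

var_dump( $long === $short ); // bool(true)
```

Since both produce the same array, there is nothing to gain by converting working long-syntax code after an upgrade.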
I love the get_queried_object() function! I don’t see how it will help us here though. get_the_terms() returns the terms assigned to a specific post object. Not only is that not what we want, but the queried object is not always a single post. In this context I’m not sure there really is a queried object, or if there is, it keeps changing as the sitemap output progresses. If anything, it’s probably whatever collection is output just before your custom terms, not a post object.
In other words, the queried object is a good thought but it won’t work for what we are trying to do. Alternative ideas always welcome though!
Ah, I get what you mean. Let’s stick to your solution and see where it takes us.
I’ll make sure I update my PHP. Would I still need to change it back to [{array_items}] after I update it?
So the error is fixed, but now the code seems to have no effect; there are no changes in the sitemap index. Please double check the code again for me to see if everything is as it should be.
function alt_get_terms( $tax ) {
    $hide_empty = true;
    $term_query = new WP_Term_Query( array('taxonomy'=> $tax, 'hide_empty'=> $hide_empty,));
    return $term_query->get_terms();
}
add_filter( 'wpseo_sitemap_index', 'add_sitemap_custom_items' );
function add_sitemap_custom_items(){
    /*
    $models = get_terms( array(
        'taxonomy' => 'vehicle_model',
        'hide_empty' => true,
    ) );
    $locations = get_terms( array(
        'taxonomy' => 'vehicle_location',
        'hide_empty' => true,
    ) );
    */
    remove_all_filters('get_terms_args');
    remove_all_filters('terms_clauses');
    $models = alt_get_terms('vehicle_model');
    $locations = alt_get_terms('vehicle_location');
    // Loop through the search terms
    $pages = '';
    foreach ( $models as $model ) {
        $location2 = home_url( '/used-cars/model/' . $model->slug );
        $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        foreach ( $locations as $location ) {
            $location2 = home_url( '/used-cars/location/' . $location->slug . '/model/' . $model->slug );
            $pages .= "<sitemap><loc>$location2</loc><lastmod>2017-04-04T23:59:00+00:00</lastmod><changefreq>weekly</changefreq><priority>0.8</priority></sitemap>\n";
        }
    }
    return $pages;
}
Works perfectly on my site when I substitute my own taxonomies. One small observation: each URL ought to have a final trailing slash. At the end of each $location2 assignment, have it look like this:
->slug . '/');
URLs without it work just fine, but WP does a redirect when the slash is missing. IMO sitemap URLs should not cause redirects.
No need to go back to the short array syntax. The advantages of updating are much more far reaching than alternate array syntax support.
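On the trailing slash point, a small sketch of the idea in plain PHP is shown below. `with_trailing_slash()` is a hypothetical helper written for this example; inside WordPress, the core function trailingslashit() serves a similar purpose.

```php
<?php
// Hypothetical helper: guarantees exactly one trailing slash on a URL,
// whether or not the input already ends with one.
function with_trailing_slash( $url ) {
    return rtrim( $url, '/' ) . '/';
}

$without = with_trailing_slash( 'https://www.example.com/used-cars/model/bmw' );
$already = with_trailing_slash( 'https://www.example.com/used-cars/model/bmw/' );
```

Both calls give the same URL ending in a single slash, so the helper is safe to apply even where the slash is already present.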
Are you sure the taxonomy names are correct as passed to alt_get_terms()? The permalink terms are just ‘model’ and ‘location’. These usually match the taxonomy name, though they can also be rewritten, so that doesn’t necessarily mean anything. When you go to the taxonomy list screen, (the one where you can add terms on the left and a list of current terms are on the right) in the address bar of your browser, what name appears after /wp-admin/edit-tags.php?taxonomy= for each taxonomy? Whatever that is is what should be passed to alt_get_terms().
If that checks out, I’m at a loss to explain why getting all taxonomy terms is so difficult. After all, the taxonomy list screen manages to get the terms just fine. It uses the WP_Term_Query class just like my alt_get_terms() function does.
I suppose the next step is write an SQL query to get the terms straight from the DB. That ought to work no matter what. At least as long as the correct arguments are used. Unfortunately, I’m not so good with SQL and the taxonomy-terms table relationship is convoluted. I’d be willing to give it a try if there seems no other alternative. Tell me this though, in both taxonomies, are they “flat”, as in non-hierarchical? Meaning there are no parent-child term relationships, or are there?
Geesh! I wonder why it isn’t working on my site. Alright, I’ll add that last trailing slash.
I’m 100% sure of the taxonomy names. Here is what it shows for location:
https://localhost/autocity/wp-admin/edit-tags.php?taxonomy=vehicle_location&post_type=used-cars
and this for models:
https://localhost/autocity/wp-admin/edit-tags.php?taxonomy=vehicle_model&post_type=used-cars
I once heard someone mention getting terms through SQL, but I have zero knowledge of how to work it.
Yeah, they are hierarchical. Sorry about that, I should have mentioned it. For example, on models there is…
Audi audi
-Audi A4 audi-a4
-Audi A6 audi-a6
etc.
… and then on locations there’s…
South Africa south-africa
-Western Cape western-cape
--Cape Town cape-town
etc.
Could this change things?
I figured as much that the slugs were correct. I had to ask anyway. A question of desperation. Hierarchy would not affect the get_terms() success or failure. I was asking in anticipation of writing an SQL solution.
I had a thought that maybe WP isn’t fully implemented since we are dealing with a request for sitemap.xml, not normal WP content. I’m not sure if that’s even possible, and anyway we should be seeing error messages like “Call to undefined function ‘get_terms’.” So that can’t be it. This has got to be the most perplexing thing I’ve ever run across! It feels like there’s something obvious standing in the way, but you can’t see it because you are not intimately familiar with WP and I can’t see it because I can only see what you see. Arrrrgh!
Give me some time again to investigate a little. I’ll try setting up Yoast to do sitemaps on my site and see if anything jumps out. If that doesn’t turn up anything, I’ll work up an SQL query to get these terms. I’m certainly no SQL wiz, and the three taxonomy tables are pretty intimidating since table joins are one of my weak points. Still, I do believe I can figure it out, though it might take a while. I’ll keep you updated on my progress.
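For reference, the kind of direct query mentioned above might look roughly like the sketch below. It is not the query from this thread, which ends before one was posted. It assumes the default `wp_` table prefix and the standard `wp_terms` and `wp_term_taxonomy` tables, and it uses an in-memory SQLite database (via PHP's pdo_sqlite extension) with a few made-up rows so that it runs without a WordPress install. `slugs_for_taxonomy()` is a hypothetical helper; on a live site the same SELECT would normally go through WordPress's $wpdb object instead.

```php
<?php
// In-memory stand-in for the two WordPress tables involved.
// Only the columns needed for this sketch are created.
$db = new PDO( 'sqlite::memory:' );
$db->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$db->exec( 'CREATE TABLE wp_terms (term_id INTEGER PRIMARY KEY, name TEXT, slug TEXT)' );
$db->exec( 'CREATE TABLE wp_term_taxonomy (term_taxonomy_id INTEGER PRIMARY KEY, term_id INTEGER, taxonomy TEXT, parent INTEGER, count INTEGER)' );

// Made-up sample rows, loosely based on the terms listed above.
$db->exec( "INSERT INTO wp_terms VALUES (1, 'Audi', 'audi'), (2, 'Audi A4', 'audi-a4'), (3, 'Cape Town', 'cape-town')" );
$db->exec( "INSERT INTO wp_term_taxonomy VALUES (1, 1, 'vehicle_model', 0, 0), (2, 2, 'vehicle_model', 1, 0), (3, 3, 'vehicle_location', 0, 0)" );

// Hypothetical helper: returns every slug in a taxonomy, parents and
// children alike, including terms with no posts (count is not filtered).
function slugs_for_taxonomy( $db, $taxonomy ) {
    $stmt = $db->prepare(
        'SELECT t.slug
           FROM wp_terms AS t
          INNER JOIN wp_term_taxonomy AS tt ON tt.term_id = t.term_id
          WHERE tt.taxonomy = :taxonomy
          ORDER BY t.slug'
    );
    $stmt->execute( array( ':taxonomy' => $taxonomy ) );
    return $stmt->fetchAll( PDO::FETCH_COLUMN );
}

$model_slugs    = slugs_for_taxonomy( $db, 'vehicle_model' );
$location_slugs = slugs_for_taxonomy( $db, 'vehicle_location' );
```

Because the query only joins terms to their taxonomy, the hierarchy does not matter here: child terms such as audi-a4 come back alongside their parents. The third taxonomy table, which links terms to posts, is not needed when the goal is simply every term in a taxonomy.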
- The topic ‘How to properly loop through these external URLs to get them into the sitemap’ is closed to new replies.