eerac
Forum Replies Created
Viewing 2 replies - 1 through 2 (of 2 total)
Forum: Fixing WordPress
In reply to: WXR File Splitter
Just realized the link I posted above expires in a month. Here’s the actual Python script:
#!/usr/bin/python

# This script is designed to take a wordpress xml export file and split it into some
# number of chunks (2 by default). The number of lines per chunk is determined by counting
# the number of occurrences of a particular line, '<item>\n' by default, and breaking up the
# file such that each chunk has an equal number of occurrences of that line. The appropriate
# header and footer is added to each chunk.

import os
import sys
import math

if len(sys.argv) < 2 :
    print('Please specify the name of wordpress export file you would like to split')
    sys.exit(0)

try :
    input_file = open(sys.argv[1], 'r')
    lines = input_file.readlines()
    input_file.close()
    (input_file_path, input_file_string) = os.path.split(sys.argv[1])
    (input_file_name, input_file_extension) = os.path.splitext(input_file_string)
except IOError :
    print('Could not open file "%s".' % sys.argv[1])
    sys.exit(0)

# Number of output files: second argument if given, never fewer than 2
number_of_chunks = max(int(sys.argv[2]), 2) if len(sys.argv) > 2 else 2

# Count the items in the export file
line_delimiter = '<item>\n'
delimiter_count = 0
for line in lines :
    if line == line_delimiter :
        delimiter_count += 1

print('')
print('File "%s" contains %s items' % (input_file_string, delimiter_count))

delimiter_count = 1.0*delimiter_count
delimiters_per_chunk = int(math.ceil(delimiter_count/number_of_chunks))

print('Creating %s files with at most %s items each:' % (number_of_chunks, delimiters_per_chunk))

header = ""
footer = "\n</channel>\n</rss>\n"
chunk_number = 1
output_file_name = "%s_%s%s" % (input_file_name, chunk_number, input_file_extension)
output_file = open(output_file_name, 'w')
print(' Writing chunk %s to file %s...' % (chunk_number, output_file_name))

delimiter_count = 0
for line in lines :
    if line == line_delimiter :
        delimiter_count += 1
    # Everything before the first item is the header, reused for later chunks
    if chunk_number == 1 and delimiter_count == 0 :
        header += line
    # Current chunk is full: close it with the footer and start the next one
    if delimiter_count > delimiters_per_chunk :
        output_file.write(footer)
        output_file.close()
        chunk_number += 1
        delimiter_count = 1
        output_file_name = "%s_%s%s" % (input_file_name, chunk_number, input_file_extension)
        output_file = open(output_file_name, 'w')
        print(' Writing chunk %s to file %s...' % (chunk_number, output_file_name))
        output_file.write(header)
    output_file.write(line)

output_file.close()
print('Done!\n')
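For anyone who wants to try the chunking logic without touching files, here is a rough in-memory sketch of the same idea in Python 3. It is not the script itself: the function name `split_wxr_lines` and the constants are made up for illustration, and it assumes (as the script does) that every item starts with a line that is exactly '<item>\n'.

```python
import math

ITEM_LINE = "<item>\n"             # assumed delimiter, same as the script's default
FOOTER = "\n</channel>\n</rss>\n"  # same footer text the script appends


def split_wxr_lines(lines, number_of_chunks=2, delimiter=ITEM_LINE):
    """Return a list of chunks (each a list of lines) split on item boundaries.

    Hypothetical helper for illustration only.
    """
    item_count = sum(1 for line in lines if line == delimiter)
    items_per_chunk = max(1, math.ceil(item_count / number_of_chunks))

    # Everything before the first item is the header shared by every chunk.
    first_item = lines.index(delimiter) if item_count else len(lines)
    header = lines[:first_item]

    chunks = [list(header)]
    items_in_chunk = 0
    for line in lines[first_item:]:
        if line == delimiter:
            items_in_chunk += 1
            if items_in_chunk > items_per_chunk:
                chunks[-1].append(FOOTER)    # close the chunk that just filled up
                chunks.append(list(header))  # start the next chunk with the header
                items_in_chunk = 1
        chunks[-1].append(line)
    return chunks
```

The last chunk gets no extra footer because the closing lines of the original file are already part of it, which mirrors what the script does.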
Forum: Fixing WordPress
In reply to: WXR File Splitter
Here is some Python code I wrote (so it should work just fine on a Mac or Linux):
https://wordpress.pastebin.ca/2004312
Just paste it into a text file (e.g. ‘splitter.py’). Go to the directory containing the newly created file. Make sure the file is executable. Then call
./splitter.py <name_of_your_wxr_file> <desired_number_of_slices>
In truth, this code hasn’t been extensively tested, but I’ve used it a few times now on various WXR files and it seems to work (it’ll output a bunch of separate WXR files that you can then import separately).
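Since the script hasn’t been tested much, one quick sanity check before importing is to confirm that the split files together contain as many items as the original export. Below is a minimal sketch in Python 3; the helper names `count_items` and `chunks_match_original` are hypothetical, and it assumes items are marked by lines that are exactly '<item>\n', as in the splitter.

```python
def count_items(path, delimiter="<item>\n"):
    """Count the lines in a file that exactly match the item delimiter."""
    with open(path, "r") as handle:
        return sum(1 for line in handle if line == delimiter)


def chunks_match_original(original_path, chunk_paths):
    """True when the chunk files together hold as many items as the original."""
    return count_items(original_path) == sum(count_items(p) for p in chunk_paths)
```

For example, `chunks_match_original('export.xml', ['export_1.xml', 'export_2.xml'])` should come back True if nothing was lost (the file names here are placeholders).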