Attachments to FileUpload Importer

rayk · October 2010

With HUGE THANKS TO MY FRIEND we have began on the task of trying to re import the attachments for vanilla 1 into vanilla 2 FileUpload.

We have been successful (this works!), but the current way is not user friendly, so we need some one to help us with making it into a user friendly php script. That would be quite awesome.

Anyway here are the instructions (the code for ruby script at the end of the post!)

First run this query to insert your old attachments table data into FileUpload table.

INSERT INTO GDN_Media (Name, Type, Size, StorageMethod, Path, InsertUserID, ForeignID, ForeignTable) SELECT Name, MimeType, Size, 'local', Path, 1, CommentID, 'comment' from LUM_Attachment

After that you need to extract the necessary data from the GDN_Media table in CSV format.

Run an SQL query:

SELECT Path, MediaID FROM `GDN_Media`

When the result comes up, click the "Export" button at the bottom of the page and export as CSV

Save the file onto your server

You should have a list of paths to media ids, like this (I made a copy of my old attachments inside FileUpload):


"FileUpload/2010/07/filename01.jpg";"3626"

"FileUpload/2010/07/filename02.jpg";"3625"

Once that's done, run the ruby script:


ruby paths_to_mediaid_paths.rb <name of csv text file> <root path of source images> <root path of FileUpload folder, including FileUpload>

So for example:


ruby paths_to_mediaid_paths.rb path_to_mediaid.txt ./uploads ./uploads/FileUpload

If you forget a parameter it will tell you

To test without copying, add "lame" to the front:

ruby lame paths_to_mediaid_paths.rb path_to_mediaid.txt ./uploads ./uploads/FileUpload

There's a final parameter to the script, which is the "dispersion factor"

That value is saved in the FileUpload plugin's option "Plugin.FileUpload.DispersionFactor"

Here is the ruby script, name the file paths_to_mediaid_paths.rb

require 'fileutils'

lame = if ARGV[0] == 'lame'
    ARGV.shift
    puts "Lame operation, no changes will be made."
    true
else
    false
end

raise "No input CSV path given." if !ARGV[0]
input_path = ARGV[0]

raise "No source images base path given." if !ARGV[1]
source_images_path = ARGV[1]

raise "No dest images base path given." if !ARGV[2]
dest_images_path = ARGV[2]

dispersion_factor = if !ARGV[3]
    puts "No dispersion factor given, using default of 20, please specify Plugin.FileUpload.DispersionFactor if different from default."
    20
else
    ARGV[3].to_i
end

puts "Reading from #{ARGV[0]}..."

path_to_id = {}

File.new(input_path, 'r').each_line do |line|
    result = /^\"(.+?)\";\"(\d+?)\"/.match(line)

    if !result
        puts "No match for line '#{line.chomp}', skipping."
        next
    end

    path_to_id[result[1]] = result[2].to_i
end

puts "#{path_to_id.length} records read."

path_to_id.each do |path, media_id|
    source_path = File.join(source_images_path, path)    
    source_extension = File.extname(source_path)
    dispersion_id = media_id % dispersion_factor
    dest_dir = File.join(dest_images_path, dispersion_id.to_s)
    dest_path = File.join(dest_dir, "#{media_id}#{source_extension}")

    puts "Copying '#{source_path}' -> '#{dest_path}'"

    begin
        FileUtils.mkdir_p(dest_dir) if !lame
        FileUtils.cp(source_path, dest_path) if !lame
    rescue Exception=>e
        puts "  Error while copying: #{e}"
    end
end

Desmanthus · October 2010

Thanks so much for working on this very important project!

Please keep us updated.

rayk · October 2010

The sql command has been updated to properly place the attachments in their relevant comment.

Still figuring a way out with renaming files.

rayk · October 2010

Moved all of this to first post!

It is all working, but it needs some one to repackage it into a friendlier php script.

Maybe @Tim can help us get this into a php script

Desmanthus · October 2010

I'm excited by your breakthrough, but not crafty enough to implement it. Eagerly awaiting a script.

vegano · October 2010

same with me.
I'am thinking about runing that local - because me webspace doesn't suport ruby, but local I could run the script.

Would be so great to have this as PHP script

Desmanthus · October 2010

Anyone able to help turn this in a php script?

Linc · October 2010

I might be willing to give it a shot if he'd thoroughly comment what's happening in the script. I can't read Ruby code. PS: Why would you make a Ruby script to migrate between PHP-based products? 0.o

vegano · October 2010

so, translated nearly everything into php and put in crowds of comments

please make the file better, because I only know bits of php and ruby, so I was only able to translate, but not to make real good code.
There is one line of Ruby left in the very end of the file - don't know how to handle errors

@Lincoln - are your able to do the rest?

Would be great to also have the SQL things done by the script and get an html output with a form.


that isn't finished or tested yet!

<?php

/* Someone should write an GUI for that.  
 * a form with files:
 * csv_path, path to attachments-files, path to fileUpload-folder, checkbox wether it's a testrun or files should be copied, dispersion-factor
 *
 * What does this script?
 * 1) Takes the variables form the form or of the config-section
 * 2) Read in the files out of the CSV-file
 * 3) copy the files from "attachments" to "fileUpload" folder
 */

/* Config section
 * fill in your paths and so on
 */
// if true, only testrun, dont copy files
$testrun = false; 
// the path to your CSV-file
$csv_path = "";
// path to your old attachments folder
$from_attachments_folder_path = "";
// path to your new fileUpload folder
$to_fileUpload_folder_path = ""; 
// dispersion-factor spread for files to many folders - by default 20, INTEGER
$dispersion_factor = NULL ; // integer


/* Checking the input variables
*/

// Check for testrun
if ($testrun == true){
echo "Lame operation, no changes will be made.";
$lame = true;
}

// check for csv_path
// may there be a much shorter way to write that!
// may should check if file exists
if (!isset($csv_path)){
print "No input CSV path given.";
break; // End script here with error message
} else{
$input_path = $csv_path;
}

// check for attachments-folder
// may there be a much shorter way to write that!
// may should check if folder exists
if (!isset($from_attachments_folder_path)){
print "No source images base path given.";
break; // End script here with error message
} else{
$source_images_path = $from_attachments_folder_path;
}

// check for attachments-folder
// may there be a much shorter way to write that!
// may should check if folder exists
if (!isset($to_fileUpload_folder_path)){
print "No dest images base path given.";
break; // End script here with error message
} else{
$dest_images_path = $to_fileUpload_folder_path;
}

// check for dispersion-factor - must be integer
if (!isset($dispersion_factor)){
$dispersion_factor == 20;
print "No dispersion factor given, using default of 20, please specify Plugin.FileUpload.DispersionFactor if different from default.";
//Don't break the script
}

// status message
echo "Reading form". $from_attachments_folder_path;


$path_to_id = array() // new array

// open CSV-file
$handle = fopen ($input_path, "r");
// read file
while (!feof($handle)) {
    $line = fgets($handle);
	// get infos - example for line: "FileUpload/2010/07/filename01.jpg";"3626"
	preg_match("/^\"(.+?)\";\"(\d+?)\"/", $line, $result);

	// if line is empty - error message and next line
	if (!isset($result) {
		echo "No match for line" . $line . ",skipping";
		continue;
		}
	//key is path to file, value is media_id
	$path_to_id[result[1]] = result[2];
} // End read file

// Count records -> status message
echo count($path_to_id) . "records read.";

foreach ($path_to_id as $path => $media_id){
$source_path = $source_image_path ."/". $path; //get path to source file
$path_parts = pathinfo($source_path); // get extetion of the file
$source_extention = $path_parts['extension'];
$dispersion_id = $media_id % $dispersion_factor; // get dispersion_id
$dest_dir = $dest_images_path . "/" . $dispersion_id; // get new folder + dispersion
$dest_path = $dest_dir. "/" . $media_id . $source_extention; //get new file path

// status message
echo "Copying " . $source_path ." -> " . $dest_path; 

// Copying
	if ($lame != true){
	mkdir($dest_dir);
	copy($source_path, $dest_path);
	// handle errors!
	rescue Exception=>e  // ruby-code
	echo "Error while copying: #{e}"
	}
}

// status message
echo "all done, all fine - have fun with fileUpload";

?>

vegano · October 2010

have it also her - for better editing:

http://gist.github.com/626240

vegano · October 2010

for automatic generation von the CSV file my that's a good start:
http://snipplr.com/view/2234/export-mysql-query-results-to-csv/
hm, maybe also helpful:
http://snipplr.com/view/42263/export-csv/
and:
http://snipplr.com/view/40637/create-csv-from-mysql/
and:
http://www.tutorial5.com/content/view/159/85/

hm - might phpmyadmin is easier than write a new script...

vegano · October 2010

so, now it has an form to put in your paths

still untested - but PHP should work + some error handling features

http://gist.github.com/629919

now I'm trying to style it a little more like the VanillaImpoter script

vegano · October 2010

so - now the script has a form to fill in your paths. The SQL stuff has to be done by hand
It's untested yet - I plan to test it tomorrow on a local installation.

Would be great if someone could write the SQL part, so people only have to but in their SQL logins and all the stuff one have to do manuall is done by the script.

The CSV file could be wrote in the same folder that the script is - so it would be even simpler to work with it.

>> http://gist.github.com/629919

Desmanthus · October 2010

@vegano Keep it up!

vegano · October 2010

enoght for today.

What works:
- the SQL things work manually
- copying files from attachments to fileUpload
- links in the forum are linked with the files (seems like, if there have been files in database befor you put in your old attachment files, that leeds to trouble with the links)

What doesn't work yet:
/* - seems like files of the first discussion aren't shown?
SHIT - seems like fileUpload saves files of the first post as a attachment of the discussion, not of the comment -> seems like work, but could be done per php
@rayk do you have attachment on your first post?! */
EDIT: found a way to fix it - two complex SQL requests and everything should be fine

To Do:
- make rules, how the paths have to look like (with or without "/" in the end), if people do it wrong, correct it.
- fix a bug -> absolute path/relative path problem
- think about how to do the SQL things within the script

Wanna help?
- Think about how to manage the SQL things - would be great if people don't have to have vanilla 1 and vanilla 2 in the same database
- Suggest me some good songs per Message
- look at the code and suggest ways to make things quicker or in a better way.
- think about some jquery magic to make things more usable

rayk · October 2010

Awesome stuff

. Yes the uploads in the first post of discussions didn't carry across for me - but I only had a couple of threads where that was an issue that could be done manually.

vegano · October 2010

yeah - figuered it out, how to fix the problem with the attachments of the first posts - need two SQL queries:

INSERT INTO GDN_Media (Name, Type, Size, StorageMethod, Path, InsertUserID, DateInserted, ForeignID, ForeignTable) SELECT Name, MimeType, Size, 'local', Path, UserID, DateCreated, CommentID, 'comment' from LUM_Attachment WHERE lum_attachment.AttachmentID NOT IN(SELECT AttachmentID FROM lum_attachment JOIN forum.lum_discussion ON lum_attachment.DiscussionID = lum_discussion.DiscussionID WHERE lum_attachment.CommentID = lum_discussion.FirstCommentID)

INSERT INTO GDN_Media (Name, Type, Size, StorageMethod, Path, InsertUserID, DateInserted, ForeignID, ForeignTable) SELECT Name, MimeType, Size, 'local', Path, UserID, DateCreated, DiscussionID, 'discussion', DateCreated from LUM_Attachment WHERE lum_attachment.AttachmentID IN(SELECT AttachmentID FROM lum_attachment JOIN lum_discussion ON lum_attachment.DiscussionID = lum_discussion.DiscussionID WHERE lum_attachment.CommentID = lum_discussion.FirstCommentID)

may could be done quicker, but seems to work for me.

vegano · October 2010

today I have started give the script a new and more usable layout + prepared it to be ready for input, to handle the SQL stuff.

I have copied the structur and CSS from the vanilla exporter, but costumized it to fit the script. And prepared some jquery stuff for usability

tomorrow I will try to get the SQL stuff working - so nothing left to be done by hand.

Don't use, but if you want to, you can have a look
http://gist.github.com/629919#file_attachments2file_upload_0_1.php

vegano · October 2010

first step is made.

If you have the databases of Vanilla1 and Vanilla2 an the same host, you now can export you Attachment database entries to FileUpload.

Copy that file and run it from on your server:
http://gist.github.com/629919#file_attachments2file_upload_0_1.php

Through the development process my data will be set as default - because tip it in over and over again sucks. Will be cleaned up for the first real release.

If your Vanilla 1 and Vanilla 2 databases are on different servers - I think the script won't work yet - had no chance to test and that has a low priority for me.

Please test the script, look at the code and tell me what should be done in different way.

Next step will be copying the files.

If your in a hurry, you can export the CSV file in the old way and run the first script

to be found: http://gist.github.com/629919#file_attachments_to_file_upload.php

Tomorrow I will try to port the file stuff into the script with the SQL stuff.

SubJunk · October 2010

This is great, I was just about to make one myself and now I don't have to. Good work.

vegano · October 2010

does your attachment save the file path relative or absolute?
mine does it absolute, but rayk's scripts seemed to me, like his does it relative...

Will put in a test, to check this out before execute the script - so it won't matter, but would be fine to know.
I try to have the script ready until monday or thusday, tonight I would link to get the file stuff to work - on the weekend I won't do anything and on monday I will have to tidy up the code and make some thinks even more usable and easier.

Attachments to FileUpload Importer

Comments