Check file before Upload

renato1723

New Member
YetiShare User
Aug 22, 2012
29
0
0
Dear Sirs,

Before an upload starts, can we check whether a file with the same name and the exact same size (in bits) already exists and, if it does, skip the upload and just return a new URL pointing to the file that is already in storage? Is this possible?

Thanks
 

adam

Administrator
Staff member
Dec 5, 2009
2,043
108
63
Hi,

It's not possible I'm afraid. The only real way to check for the existence of a file is to use a hash of the file contents and you need the whole file on the server to do it.
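A minimal sketch of the server-side check described above, assuming the whole file is already on disk. The helper name `fileFingerprint` and the choice of SHA-256 are illustrative assumptions, not part of YetiShare:

```php
<?php
// Hypothetical helper: fingerprint of a file that is already on the server.
// Size is included so that a cheap comparison can rule out most non-matches.
function fileFingerprint(string $path): array
{
    return [
        'size'   => filesize($path),
        'sha256' => hash_file('sha256', $path), // reads the whole file
    ];
}

// Usage: two files with identical bytes produce identical fingerprints.
$a = tempnam(sys_get_temp_dir(), 'dup');
$b = tempnam(sys_get_temp_dir(), 'dup');
file_put_contents($a, 'same bytes');
file_put_contents($b, 'same bytes');
var_dump(fileFingerprint($a) === fileFingerprint($b));
unlink($a);
unlink($b);
```

Because `hash_file` has to read every byte, this only works once the upload has finished, which is the limitation being pointed out.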

Regards,
Adam.
 

renato1723

Hello Adam,

Megaupload did a check using a percentage of the uploaded file. I do not know how, but you could send a 1GB file and within a few minutes it reported the upload as complete. At the time I tested this by sending a single file, and it took eight hours. The next day I sent the same file again and it went through in 22 minutes.

Is it impossible to do a check with a piece of the file?

If not, even if the file has to be sent, can we not hash it to avoid wasting storage? The most expensive part is not the Internet bandwidth but the disk capacity, that is, the storage itself.

---
I have an idea.

How about hashing the file and storing the result in the database? Then, before the next file is stored (even in the temp dir), its name, hash value and size are checked. If they match, do not store it; just issue a new download URL for the file that already exists.

This new feature does not seem very hard to develop and would make the program much smarter.
---
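The idea above could be sketched roughly as follows. This is only an illustration: the `$index` array stands in for the database table, `storeOrReuse` is a hypothetical name, and the sketch keys on size plus SHA-256 rather than on the file name, since two identical files can have different names.

```php
<?php
// $index stands in for a database table mapping "size:hash" => stored path.
// Returns the path to serve and whether the upload was a duplicate.
function storeOrReuse(array &$index, string $tmpPath, string $storageDir): array
{
    $size = filesize($tmpPath);
    $hash = hash_file('sha256', $tmpPath);
    $key  = $size . ':' . $hash;

    if (isset($index[$key])) {
        unlink($tmpPath); // duplicate: discard the new copy, reuse the old one
        return ['path' => $index[$key], 'duplicate' => true];
    }

    $dest = $storageDir . '/' . $hash;
    rename($tmpPath, $dest); // first copy: move it into storage
    $index[$key] = $dest;
    return ['path' => $dest, 'duplicate' => false];
}
```

Each call would then create its own download URL record pointing at the returned path, so several URLs can share one stored file. A stricter variant could also compare the two files byte by byte before treating them as the same.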

Best Regards
 

adam

Hi,

I'm not sure how Megaupload did it, but it's not doable as far as I'm aware, unless it was using a hidden Flash element to read the information somehow.

It is possible to adjust the script to not store duplicate files; however, it's a big job to amend the code. For example, when someone deletes a file you need to make sure no-one else links to it before removing it. The checking for duplicates would need to be bulletproof as well, since you wouldn't want files accidentally assumed to be the same.
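One common way to handle the deletion problem is reference counting. The sketch below is an assumption about how it could look, not how the script does it: `$refs` stands in for a database column, and `addLink` / `removeLink` are hypothetical names.

```php
<?php
// $refs maps stored path => number of user-facing links that point at it.
function addLink(array &$refs, string $storedPath): void
{
    $refs[$storedPath] = ($refs[$storedPath] ?? 0) + 1;
}

// Drops one link. Returns true only when the last link is gone and the
// physical file has been removed from storage.
function removeLink(array &$refs, string $storedPath): bool
{
    if (!isset($refs[$storedPath])) {
        return false; // unknown file: nothing to do
    }
    if (--$refs[$storedPath] > 0) {
        return false; // someone else still links to it
    }
    unset($refs[$storedPath]);
    if (is_file($storedPath)) {
        unlink($storedPath);
    }
    return true;
}
```

In a real database the decrement and the delete would need to happen in one transaction, so that a new link cannot be added between the check and the removal.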

It's on the list to look at at some stage, though.

Thanks,
Adam.
 

mfa9884569

New Member
YetiShare User
Dec 22, 2014
6
1
1
Hi,

Could I know whether this function is planned now?

Duplicate file detection would be a big improvement to the upload experience.
 

adam

Hi,

We've had duplicate file checking for a while now, so the same file isn't stored twice on your server.

Thanks,
Adam.
 

enricodias4654

Member
YetiShare User
Jan 13, 2015
411
1
16
Hello Adam.

It is possible. There are several ways to do it.

- HTML5 already supports chunked file uploads as far as I know.
- It may be possible to run software on the client to calculate the md5 before sending the file.
- PHP may be able to read the temp file during the upload. I'm not sure if this will work.
- Upload the file using the PUT method.

It may be possible to do something like this with the PUT method:

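// Read the raw PUT body as a stream.
$putdata = fopen("php://input", "rb");

$fp = fopen("temp_file_name", "wb");
$ctx = hash_init("md5"); // running md5 of the bytes received so far

// Loop reading 1 KB at a time, but I think PHP will read the entire packet (MTU size).
// Compare against "" instead of relying on truthiness: a chunk that is just "0" would end the loop early.
while (!feof($putdata)) {
    $data = fread($putdata, 1024);
    if ($data === false || $data === "") {
        break;
    }
    fwrite($fp, $data);
    hash_update($ctx, $data);
    // At some point, check the partial md5 with hash_final(hash_copy($ctx)) and break out of the loop.
}

fclose($fp);
fclose($putdata);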

You will need to add "Script PUT /file_to_handle_uploads.php" to the Apache config.
I also don't know how the browsers will react if the server closes the connection in the middle of the upload.

Many users talk about checking the md5 before the upload, but this is the first time I've seen someone suggest checking the md5 of a partially uploaded file. I will try to implement this. Thanks a lot for the idea, renato1723.
 

enricodias4654

I tested an ajax upload using the PUT method. It works! Check the html file in the attachments.

Tested only on Firefox 42. Use Firebug to watch the requests.
 

Attachments