Welcome to Esato.com




Is there a web page over 500Kb?



Posted by sadeghi85
Is there a web page over 500 KB? (HTML only, excluding images, Flash, etc.)


Posted by jcwhite_uk
I doubt it. 500 KB is a lot of HTML code.

Posted by se_p800
Why do you want to know?

Posted by sadeghi85
I'm currently developing an app in PHP, and I want to know the maximum size of regular web pages so I can set a limit: the app should skip web pages and download only the files I'm interested in. ("Content-Type" won't work because, for example, the Metacafe server sends "text/plain" for "flv" files, which obviously aren't plain text.)

Posted by Johnex
Well, I usually serve all my files through PHP so I can limit the speed. It works for very large files, like 200 MB+. The problem is that PHP has an execution time limit, 30 seconds by default, depending on your server configuration.

[ This Message was edited by: Johnex on 2008-01-13 13:59 ]

Posted by sadeghi85
That's not a good decision IMO. The client can't resume downloads, because a plain PHP script won't send a "Content-Range" header unless you implement range handling directly in the script, which is hard (and why do that when the web server already supports it?). Consider a client on a slow connection, for example Firefox trying to download an image: if it can't download the image completely, it sends another request with a "Range" header to fetch the remaining bytes. If you pass your files through PHP, you force the client to download the file from the beginning, which overloads the server.
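For anyone wondering what "implementing this feature" involves, here is a rough sketch of the header logic, written in Python rather than PHP just to keep it short. The function names `parse_range` and `partial_response` are made up for illustration, and it only handles a single byte range; it is not how any particular server does it.

```python
import re

def parse_range(header, size):
    """Return (start, end), inclusive, for a single-range 'bytes=' header.

    Returns None when the range cannot be used. Simplification: malformed
    and unsatisfiable ranges are not distinguished here.
    """
    m = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not m or size <= 0:
        return None
    first, last = m.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        n = int(last)
        if n == 0:
            return None
        return max(size - n, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    end = min(end, size - 1)
    if start > end:  # also covers a start beyond the end of the file
        return None
    return start, end

def partial_response(header, size):
    """Return (status, headers) for a request carrying a Range header."""
    rng = parse_range(header, size)
    if rng is None:
        return 416, {"Content-Range": f"bytes */{size}"}
    start, end = rng
    return 206, {
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
    }
```

A client resuming a 1000-byte file after receiving 100 bytes would send "Range: bytes=100-", and `partial_response("bytes=100-", 1000)` gives a 206 status with "Content-Range: bytes 100-999/1000". The script would then still have to seek to byte 100 and send only the remainder.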

Regarding the timeout limit, you can use: @set_time_limit(0);
if "safe mode" is off. If you can access php.ini, set "max_execution_time" appropriately.

My point wasn't to limit speed.


EDIT:

Additionally, this approach defeats caching. The client will send "If-Modified-Since", but the script won't answer with "304 Not Modified" even though the file is untouched, again loading the server with unnecessary transfers. (And my browser has to download that pic every time I visit this page, so please remove it :) )
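The check a script would need for this is small. A minimal sketch, again in Python for brevity; `conditional_status` is a hypothetical name, and it assumes the file's modification time is available as a POSIX timestamp:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_status(if_modified_since, file_mtime):
    """Return (status, last_modified_header) for a conditional GET.

    if_modified_since: raw header value from the client, or None.
    file_mtime: file modification time as a POSIX timestamp (assumed input).
    """
    # HTTP dates have one-second resolution, so drop fractional seconds.
    modified = datetime.fromtimestamp(int(file_mtime), tz=timezone.utc)
    last_modified = format_datetime(modified, usegmt=True)
    if not if_modified_since:
        return 200, last_modified
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable date: ignore the header and send the full file.
        return 200, last_modified
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # 304 only when the file has not changed since the client's copy.
    status = 304 if modified <= since else 200
    return status, last_modified
```

With a 304 the script sends headers only and no body, which is what saves the bandwidth.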




[ This Message was edited by: sadeghi85 on 2008-01-12 23:01 ]

Posted by Johnex
Dude, I know all this; I've worked with PHP for 6 years. I gave that as an example, since you want a site to be sent with PHP.

Posted by sadeghi85
Well, my second post was confusing. Let me explain my situation again. I want to use PHP as a client, since C++ is too complicated for me. My app is like a downloader: it searches a web page for links and downloads certain files. I want it to download all files except those that are plain text in nature. Since "Content-Type" won't work, I need a size limit, so that a file bigger than the limit can be considered a binary file. So far a limit of 500 KB has worked for me. The biggest HTML file I've seen was about 350 KB, somewhere on mp3.com.

And I'm curious: why do you want to limit speed?
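The size test described here doesn't require downloading the whole file. A rough sketch of the idea, in Python for brevity: `exceeds_limit` is a hypothetical helper, and `stream` is assumed to be any object with a `read(n)` method, such as an open HTTP response body.

```python
import io

def exceeds_limit(stream, limit=500 * 1024, chunk_size=8192):
    """Return True if the stream holds more than `limit` bytes.

    Reads at most limit + 1 bytes, so a large file is never fetched in full
    just to be classified.
    """
    total = 0
    while total <= limit:
        chunk = stream.read(min(chunk_size, limit + 1 - total))
        if not chunk:
            return False  # stream ended at or below the limit
        total += len(chunk)
    return True

# Example with in-memory data standing in for a response body.
small_page = io.BytesIO(b"<html>" + b"x" * 1000 + b"</html>")
print(exceeds_limit(small_page))  # a ~1 KB page is under the 500 KB limit
```

When the server sends a "Content-Length" header, comparing that number against the limit avoids reading anything at all; the capped read is the fallback for responses without it.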

Posted by ÈL ® ö B ì Ñ
Because you don't want 5 people downloading large files and maxing out the server connection and having the site run slower for everyone else just doing general browsing?


Just as an example.

Posted by Cycovision
You could always write a little PHP string handling function to find the dot(s) and determine the filename's extension that way?

Posted by sadeghi85
That won't work. Consider a link like this: http://example.com/?fileid=35465
It can be a file, and there is no dot in it. So I have to check the response headers, and "Content-Disposition" isn't always available.
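To make the fallback order concrete, here is a small sketch (Python for brevity) that tries "Content-Disposition" first and the URL path second. `guess_filename` is a made-up name, and `headers` is assumed to be a plain dict keyed by the exact header name.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlparse

def guess_filename(url, headers):
    """Return a filename for a download, or None if nothing usable is found."""
    disposition = headers.get("Content-Disposition")
    if disposition:
        # Let the standard library parse the header parameters.
        msg = Message()
        msg["Content-Disposition"] = disposition
        name = msg.get_filename()
        if name:
            return name
    # Fall back to the last path segment, but only if it has an extension.
    name = posixpath.basename(unquote(urlparse(url).path))
    return name if "." in name else None
```

For http://example.com/?fileid=35465 with no "Content-Disposition" header, this returns None, which is exactly the case where some other test, such as a size limit, is needed.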



On 2008-01-13 18:35:41, ÈL ® ö B ì Ñ wrote:
Because you don't want 5 people downloading large files and maxing out the server connection and having the site run slower for everyone else just doing general browsing?


Just as an example.



Using two hosts, one for large file storage and the other for general use, is a better approach IMO. I described the disadvantages of passing files through PHP earlier.

Posted by Johnex
ÈL ® ö B ì Ñ is correct about my usage. Also, giving members faster download speeds is something RapidShare does, and it works quite nicely as an incentive to register.

I would go with what Cyco suggested, but I don't see how PHP will download multiple files for you; downloading to the web server itself is possible, though. I made a PHP-based proxy a while back: it opens a socket connection to a page, downloads the page, and rewrites all the links to go through the proxy. All linked files also go through the proxy without issues and come back with the correct content type. This could be modified to use the socket to download to the actual server, but it won't be as effective as a native C or C++ program.

[ This Message was edited by: Johnex on 2008-01-13 19:21 ]

Posted by sadeghi85
RapidShare pushes users toward a premium account with 3 strategies:
1. By passing the file through PHP (if they are using PHP):
* A user wants to download a file. If the connection drops, he has to download the file from the beginning.
* He can't use a downloader (e.g. Firefox -> FlashGot -> FlashGet), because the downloader can't reuse the socket opened by the browser; it has to open another socket, and the PHP script will treat that as a separate request.
2. By blocking the IP for, e.g., one hour.
3. By using a CAPTCHA image to prevent automation.

These restrictions are annoying and force users to register. A regular download using a downloader like FlashGet is even faster. So passing files through PHP is there to force users to register, not to give everyone faster download speeds.

Actually, I wrote the app for downloading from RapidShare (and other hosts that don't support resuming downloads)!
So far my approach has been successful, and I haven't had any problems with the app. I just asked, "Is there a web page over 500Kb?"

Thanks for the replies.

Posted by Johnex
http://pizzaseo.com/google-cache-maximum-file-size

Short and concise, yes there are.

Posted by sadeghi85
Thanks.

Since Google's cache doesn't go over a 1 MB limit, I'll set the limit to 1 MB.

Many, many thanks, you solved my problem.




© Esato.com - From the Esato mobile phone discussion forum