Race history will never be part of the pubstats. Pubstats already accounts for 1/3 of all LFS World traffic, and race information would increase that heavily because there is so much of it.
Best to take Dexter's advice and parse replays.
Would there be any way to query a certain user (or even better, several users at once, to reduce the number of connections made) to see whether he's online and, if he is, get the server info (or just the server name, with a separate query for the server info)?
Atm I'm using the hosts script to get this info, but even when refreshing only every 60s it is still a rather big burden on the server, I think (25kB/query). Atm I'm the only one using the program, but if I distributed it I imagine this would become a problem... (It's a systray program which gives a notice when a buddy comes online, shows the server info, lets you join, etc.)
Only the way you are using. There used to be a racer list for S1, but Victor decided not to make the same script available for S2.
IMO, for your app, you should have a 'proxy' that your application connects to. This could be a simple web script that fetches and caches lfsworld data every 60 seconds. Given 100 concurrent users, this would result in ~25kB every 60 seconds from the lfsworld servers, as opposed to ~2.5MB (100 x 25kB).
Obviously it's another stage that can go wrong, but it creates an ad hoc kind of distributed network which will allow the lfsworld servers to go a little further.
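A minimal sketch of the caching idea, assuming Python and a single-threaded script. `CachedFetcher` and the `fetch` callable are made-up names for illustration; `fetch` stands in for whatever actually downloads the hosts list from lfsworld, which isn't shown here.

```python
import time
from typing import Callable, Optional


class CachedFetcher:
    """Serve one cached copy of an upstream response to any number of clients."""

    def __init__(self, fetch: Callable[[], bytes], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical: downloads the hosts list
        self._ttl = ttl              # seconds a cached copy stays valid
        self._clock = clock          # injectable so the cache can be tested
        self._data: Optional[bytes] = None
        self._fetched_at = 0.0
        self.upstream_hits = 0       # how often the upstream was actually hit

    def get(self) -> bytes:
        """Return the cached data, refreshing it only when it has expired."""
        now = self._clock()
        if self._data is None or now - self._fetched_at >= self._ttl:
            self._data = self._fetch()
            self._fetched_at = now
            self.upstream_hits += 1
        return self._data
```

With this in front, 100 clients polling within the same 60 seconds cause one upstream request instead of 100. A real web script would also need locking or a shared store if it serves requests in parallel.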
Good idea. The proxy could provide an interface with just the stuff I want, so 25kB/60s wouldn't have to go from the proxy to every user. Only problem is, I don't have a webserver (one which allows that kind of scripting) available. And since it's something I think multiple apps could benefit from, I'd rather have it on the lfsworld side.
Another solution would be to put this in a local database: getting the data can be automated (say, twice a day) and your local users connect to the database.
As an open source fan, I'd say MySQL for the db and Perl to fetch the data would match perfectly, even for someone who is not a computer expert. The documentation is really very clear...
Which is effectively what Anarchi suggested; however, it's impractical for an "is my buddy online" system to only update twice a day.
My suggestion would be to contact someone who may be interested in having their logo on the application (i.e. "In association with X") in return for providing web hosting for the "proxied cache". The proxied cache could then be modified so that only specific racer information is passed back down to the user. This cuts down on the traffic both from LFSW and from your proxied cache.
Perhaps lfs.fr or ocr, etc. may be interested in this.
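To illustrate the "only specific racer information" part, here is a rough Python sketch. It assumes the host list has already been parsed into simple records; the `Host` shape, the `find_buddies` name and the case-insensitive name matching are all assumptions for the example, not the real lfsworld format.

```python
from typing import Dict, Iterable, List, TypedDict


class Host(TypedDict):
    """Hypothetical parsed host entry: server name plus racers on it."""
    name: str
    racers: List[str]


def find_buddies(hosts: Iterable[Host], buddies: Iterable[str]) -> Dict[str, str]:
    """Map each online buddy to the name of the host they are on."""
    # Assumption: racer names are compared case-insensitively.
    wanted = {b.lower() for b in buddies}
    online: Dict[str, str] = {}
    for host in hosts:
        for racer in host["racers"]:
            if racer.lower() in wanted:
                online[racer] = host["name"]
    return online
```

The proxy would then send back only this small mapping for the buddies a client asked about, rather than the whole 25kB list.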
Meh, if someone needs a hosting package with the typical stuff on there, all the bells & whistles etc., then I don't mind helping out.
Assuming a ton of storage/bandwidth isn't going to be required, I see no reason why not. I'm currently only using what I have for testing purposes and there's about 95GB b/w going unused this month hehe (would be 25GB next month, as I'm relocating to a much faster & more reliable datacenter/server).
As I say, I've plenty of bandwidth going unused. There's a shop going on there in about a month's time and another sometime before xmas or whenever I get the chance to move it onto my server; the rest of it is usually unused, as what little I'm doing is typically only seen by myself, so not much traffic there.
Anyone else who might be interested in something like this can give me a shout if you wish, either by PM or, probably quicker, via email to [email protected]
If there's some way of creating a 2nd source for the data we all want access to, then maybe it's possible to organise this?
It certainly would make more sense for us all to use out-sourced data rather than all attacking the same server.
Maybe I'm missing something. I guess I'd take the traffic hit on it rather than the lfs server, but if it was shared over multiple locations rather than each team attacking 1 location, then maybe additional info could be made available if we free up the data line.
Very possible; I've thought about it myself, but we'd need to make sure that all of the subordinate data sources were kept up to date with lfsworld, and that they all fetched from lfsworld. The 'clients' would then use them as a data source instead of requesting directly from lfsworld.
ATM though, I'm not sure there is a need for a mirror service for LFSWorld stats, because I can't see many more sites requesting the data. Obviously only Victor really knows though. Maybe your thoughts on this, Victor?
If it were to be implemented, we'd need either a common standard or a byte-for-byte mirror to ensure compatibility across the sources.
Could someone clarify for me a bit how to parse replays? I used 'Output lap data', but I didn't find the .raf usable. Do I have to decode the .raf or is there another way to parse replays? Thanks in advance.
There's not really a need for a mirror so far. We can handle it and will be able to for a while.
If it really came to mirrors being needed, it should not be done by actually downloading db data. That would require huge downloads all the time (mirroring everything) and it wouldn't even be up to date to the minute. So I'm thinking more along the lines of a mirror processing its own stats like we do. The mirror would get a copy of the real-time statistics in binary format as they come in at our server. This data stream is very small, so it would be much more efficient and much closer to real-time than mirroring a remote db by downloading it at intervals.
But as said, it's not really something to worry about atm.
My thinking behind the mirrored data was that the information people would like to access could then be made available, rather than the current 'would increase traffic too much' problem stopping us from getting at it.
If the data was updated with a minimum interval of 15sec between updates, and only when requested, then that wouldn't be too far from being live data. It'd just be nice to be able to use some 'live' stats.
I thought splitting the load so LFSW gets less of a hammering could possibly mean that making more information available wouldn't be such a problem, compared to every man and his dog heading to 1 source.
As a suggestion (in the unlikely case you've not thought of it): database replication doesn't take much bandwidth once the initial mirroring is done. I know it isn't very good at the moment within MySQL (no idea about PostgreSQL either tbh), but I do have experience with it in the mighty and proprietary MSSQL (it's the one thing MS have got right, along with AD), and it's amazing. Assuming that MySQL / whatever you're using catches up / already supports it, it may be something worth looking into if it is ever needed, perhaps?
Also, is it really a good idea to dump the entire list of teams to the user? Perhaps optional specifics may be handy?
I've created an lfsw-ppf provider for the teams listing based on your spec, and it works great.
The only thing I'd suggest changing is the tarpit behaviour. ATM it's 5 seconds, sharing an access list with the hosts listing.
I think it would be more suitable to have something like 180 seconds, with its own access list.
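Roughly what I mean, as a Python sketch. I don't know how the tarpit is actually implemented on lfsworld, so this is only an illustration: the `Tarpit` class, the `"hosts"` / `"teams"` resource names and the choice not to reset the timer on a rejected request are all my own assumptions. Only the 5 and 180 second figures come from the posts above.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class Tarpit:
    """Per-client rate limit that keeps a separate access list per resource."""

    def __init__(self, intervals: Dict[str, float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._intervals = intervals   # resource name -> minimum seconds between hits
        self._clock = clock           # injectable for testing
        # Keyed by (client, resource), so a hit on one listing never
        # counts against another listing.
        self._last_allowed: Dict[Tuple[str, str], float] = {}

    def allow(self, client: str, resource: str) -> bool:
        """Return True if this client may fetch this resource right now."""
        now = self._clock()
        key = (client, resource)
        last: Optional[float] = self._last_allowed.get(key)
        if last is not None and now - last < self._intervals[resource]:
            return False              # assumption: rejected hits don't reset the timer
        self._last_allowed[key] = now
        return True


# Hypothetical configuration: hosts stays at 5s, teams gets 180s.
tarpit = Tarpit({"hosts": 5.0, "teams": 180.0})
```

This way polling the hosts listing every few seconds wouldn't lock you out of the teams listing, and the teams listing can't be hammered more than once per 3 minutes per client.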
I'm not bothered by that extra bandwidth - are you? (apart from taa)
I don't think it's really a _problem_, is it? It's just for the sake of simplicity that I keep it like this: you just get the whole hostlist, and then you have all the info and can do with it what you want. If that means throwing away half the data, then so be it.
If there's someone having a real problem with it, write it down, but I can't imagine tbh.