Viewing 8 posts - 1 through 8 (of 8 total)
Author
Posts
  • #7937

    I’ve just experienced an interesting problem which prompts me to ask a couple of questions about the Clippings Cart. I suddenly found myself unable to upload media items – error message:

    Error uploading file; PHP failed to write to disk

    As this came hot on the heels of recent server modsecurity setting problems I immediately suspected a server parameter problem and chased the hosting service. They couldn’t find anything untoward but warned me that:

    …../public_html/kiwitrees/data is using around 6.5G of space.

    and said that this

    couldn’t be helping

    I took a look at the data directory and found about 30 Clippings Cart files, all dated 9 February and many very large eg 290 MB.
    Having been away for a week I know I was not active that day, so these ‘Clippings’ were certainly not of my doing. I deleted them and my media upload problem went away. I shall now try to work back through the logs to determine who was doing what. In fact I rather foolishly deleted ALL the clipping files so I won’t be able to establish WHAT but should be able to find out WHO.
    Meanwhile, I was surprised that someone other than me (I’m the only one currently with Admin authority) could fill up the data directory. And the first question is:
    1. When a registered user of the site takes a ‘clipping’, after being downloaded to his system does it stay in the data directory until I remove it? I find this difficult to believe, as I have never had to do this before nor can I remember seeing other users clipping files there.
    2. If not how could all these clipping files appear in the data directory?
    Thanks for clarification Nigel.

    Ron in France Website: https://clan-davies.kiwitrees.net/ kiwitrees 3.3.11; PHP 8.0.14

  • #7938

    Have started working my way through the log for 9 February (kicking myself for not having kept at least one of the clipping files to give me a time stamp, as there are thousands of entries). All that I have seen so far are attributable to four IP addresses, all of which appear to be spiders/web crawlers – using SemrushBot, AhrefsBot, BingBot and Sogou web. So I must ask another question:
    3. Is it possible for web crawlers to generate Clippings files in the data directory? Surely it should not be?
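
    For anyone repeating this kind of log trawl, a minimal sketch of tallying one day's requests per IP address and user agent is shown here. It assumes the server writes the common Apache/Nginx "combined" access log format, which is not confirmed anywhere in this thread; the function name `tally_clients` and the `day` prefix argument are made up for illustration.

```python
import re
from collections import Counter

# Assumed "combined" log line (an assumption, not confirmed for this host):
# IP - - [date] "request" status bytes "referer" "user agent"
LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def tally_clients(lines, day):
    """Count requests per (ip, user_agent) for log lines dated `day`.

    `day` is the date prefix as it appears in the log, e.g. "09/Feb/2022".
    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match.group(2).startswith(day):
            counts[(match.group(1), match.group(3))] += 1
    # Busiest clients first.
    return counts.most_common()
```

    Reading the access log line by line and passing the lines to `tally_clients` gives a short list of the busiest clients, which is quicker than scrolling through thousands of entries by hand.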

    Ron in France Website: https://clan-davies.kiwitrees.net/ kiwitrees 3.3.11; PHP 8.0.14

  • #7942

    I took a look at the data directory and found about 30 Clippings Cart files, all dated 9 February and many very large eg 290 MB.

    How did you determine they were clippings cart files? If you are correct they would have been named clippings.ged, clippings(1).ged, clippings(2).ged etc…. or possibly .zip instead of .ged. Could they be something else completely?

    1. When a registered user of the site takes a ‘clipping’, after being downloaded to his system does it stay in the data directory until I remove it?

    No, it doesn’t (normally). I have just tested this on both my site and yours. On both, no clippings cart file is saved in the data directory or anywhere else. The code specifically creates the file, adds content to it, downloads it, closes it, then deletes it. It appears to be working correctly.

    2. If not how could all these clipping files appear in the data directory?

    My best guess would be that the error you first noticed has been around for a while. It might be possible that the error (exceeding your allotted memory / space) prevented the code from completing the deletion. (Note: this is a possible theory only – not tested). Does your web host have a backup of your site from just before you deleted those files? It would be good to look at one of them to see when they were created, what they contain, etc.
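
    As a rough aid for that kind of check, here is a small sketch that lists any leftover cart files in a data directory with their sizes and modification times. It is not part of kiwitrees; the function name is hypothetical, and the only thing taken from this thread is the naming pattern (clippings…​.ged or .zip).

```python
from datetime import datetime
from pathlib import Path


def find_leftover_clippings(data_dir):
    """Return (name, size_in_bytes, modified_time) for leftover cart files."""
    leftovers = []
    for path in Path(data_dir).glob("clippings*"):
        # Only .ged and .zip match the naming described in this thread.
        if path.is_file() and path.suffix.lower() in {".ged", ".zip"}:
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime)
            leftovers.append((path.name, info.st_size, modified))
    # Largest first, since the big files are the ones that fill the disk.
    return sorted(leftovers, key=lambda item: item[1], reverse=True)
```

    Run against the data directory before deleting anything, this would preserve the time stamps needed to match a file to entries in the site log.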

    … all that I have seen so far being attributable to four ip addresses all of which appear to be spiders/web crawlers – using SemrushBot, AhrefsBot, BingBot and Sogou web. So I must ask another question:
    3. Is it possible for web crawlers to generate Clippings files in the data directory? Surely it should not be?

    No it isn’t. I do not think these will be related to this problem. They can’t even access these pages, let alone make the right option choices to create the file.
    However, you might anyway want to think about whether you want those particular search engines crawling your site. They can be blocked via Administration > Site administration > Site access rules. (see http://kiwitrees.net/faqs/general-topics/site-access-rules/ for details)

    Regardless of whether this does turn out to be related to the clippings cart, I will investigate the feasibility of adding a log entry to record each time a clipping is created and downloaded, and by whom.

    Nigel
    My personal kiwitrees site is www.our-families.info
  • #7946

    How did you determine they were clippings cart files? If you are correct they would have been named clippings.ged, clippings(1).ged, clippings(2).ged etc…. or possibly .zip instead of .ged.

    They were certainly all clippings xxxx.ged and .zip – I remember that – and they were very large files; the last few were all about 290MB. And I think this is the clue. I have a feeling that when the file reaches a certain size and some size parameter somewhere is exceeded, the download cannot be completed and the files remain on the server in the data directory. I have just tested the theory by selecting my two most distant ancestors and requesting all their descendants PLUS all the media items associated with them. I clicked Download; it seemed to be working for a long time but nothing arrived on my PC. I checked the server and there in the data directory were two clippings files: a 5.9MB .ged file and a 56.04MB .zip file. I repeated the exercise WITHOUT requesting a zip file of all associated media items; the download was performed OK and nothing remained in the data directory.
    Tried various combinations. When the file size is sufficiently small that the download commences, tried selecting cancel on the download reference tab that appears at the bottom of the screen. Download was cancelled mid-flight but nothing in data directory. Suggests to me that it is only when the cause of the download failure is either within kiwitrees or server software that the files end up in the data directory and that external causes do not have this result.
    I have tried several times to repeat the creation of files in the data directory but have failed – there has been neither a successful download nor a termination with files in the data directory – the system has simply hung.
    Will do some further testing later and report back. For the moment I must leave it as I am late for an appointment.

    Ron in France Website: https://clan-davies.kiwitrees.net/ kiwitrees 3.3.11; PHP 8.0.14

  • #7947

    You must understand that it is called “clippings” for a reason. It is meant to imply the extraction of a SMALL part of a family tree 🙂

    The sizes you are referring to are well outside the design parameters, unless you have huge server resources, both in terms of the memory required to amass the necessary data to put in the file as well as the file space to save it. Zipping it is only a later option, designed to make the actual downloading over slower internet connections easier.

    Clearly your server cannot cope, but that is not unusual. You would be paying a very high price if it could.

    Your testing is simply demonstrating the way the code needs to work.
    Step one, work out the connections to all individuals, families, sources, notes, media requested.
    Step two, gather all of the GEDCOM data and construct a fully formatted GEDCOM file of it in memory.
    Step three, create an empty file in the data directory.
    Step four, copy the data into the file.
    Step five, convert the file into a zip folder.
    Step six, copy each linked media item into the zip folder, then close the file.
    Step seven, open a browser download window and let it do its thing to transfer the zip file to your desktop.
    Step eight, delete the folder.
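
    To illustrate why a failure part-way through can leave files behind, steps three to eight can be sketched as follows. This is an illustrative sketch only, not the kiwitrees code (which is PHP); the function name and the `deliver` callback standing in for the browser download are hypothetical.

```python
import os
import tempfile
import zipfile


def build_clippings_zip(gedcom_text, media_paths, data_dir, deliver):
    """Build a temporary zip in data_dir, hand it to deliver(), then clean up."""
    # Steps three and four: create a uniquely named file and copy the data in.
    fd, ged_path = tempfile.mkstemp(prefix="clippings", suffix=".ged", dir=data_dir)
    zip_path = ged_path[:-4] + ".zip"
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as ged_file:
            ged_file.write(gedcom_text)
        # Steps five and six: zip the GEDCOM and add each linked media item.
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.write(ged_path, arcname="clippings.ged")
            for media_path in media_paths:
                archive.write(media_path, arcname=os.path.basename(media_path))
        # Step seven: send the archive to the user (hypothetical callback).
        deliver(zip_path)
    finally:
        # Step eight: runs after success and after any raised error.
        for leftover in (ged_path, zip_path):
            if os.path.exists(leftover):
                os.remove(leftover)
```

    Even with the deletion placed in a `finally` block, the cleanup only runs if the process survives long enough to reach it. If the server kills the script outright (for example on a memory or time limit), step eight never happens and the files stay in the data directory, which is consistent with what was observed in this thread.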

    My (personal) recommendation would be to not worry about the process failing, as that is inevitable unless you want to pay for a massive increase in your server resources. Instead, try to work out why one or more users find it necessary to take away such large quantities of data?
    I overcame this simply on my site by removing access to the clippings cart from users below “manager” level. Only three of us out of 100+ have that level of access. They can still request the data, but not take whatever they think they need whenever they want. For my site that would be a privacy concern.

    Nigel
    My personal kiwitrees site is www.our-families.info
  • #7950

    You must understand that it is called “clippings” for a reason. It is meant to imply the extraction of a SMALL part of a family tree

    I couldn’t agree more Nigel – we should be talking clippings not entire forests!

    My (personal) recommendation would be to not worry about the process failing …… Instead, try to work out why one or more users find it necessary to take away such large quantities of data?

    Again I agree. Although I am a little alarmed that the consequences could be quite serious if someone takes it into their head to do this and the data directory fills up with many GBs of files that have failed to download – especially whilst I am away for a week, as was the case this time. Do you think you could put some kind of warning message in the Clippings Cart intro text to steer people clear of trying to download most of the GEDCOM – which seems to have happened on this occasion? It would be good if you could even say that those requiring a substantial part of the database to merge with their own files should not attempt to download via the clippings cart but should contact the site owner with a view to obtaining a full GEDCOM. I’m reluctant to turn off the facility for all users except managers because I know that it has been and continues to be used by a number of my users who maintain their own PC-based family trees.

    I have now identified the culprit, who should have known better (an IT professional working for Google!!) and was trying to use the tool to extract a substantial portion of the database and all the associated media items to build into a new system he was creating for his wife’s family. I’ve now sent him a full GEDCOM, which he should have asked for in the first place.

    Hopefully this is a one-off, but if consideration could be given to a message in the Clippings Cart text as I have suggested I think it would help to avoid any recurrence.

    Incidentally, having just consulted the data directory again to remove the exported GEDCOM I have just created, I see that the results of my tests this morning – when the system appeared simply to hang – have finally appeared there – a number of enormous (over 300MB!!) zip files, taking my total server storage usage well over my allocated limit! Now deleted and hopefully not to be repeated.

    Ron in France Website: https://clan-davies.kiwitrees.net/ kiwitrees 3.3.11; PHP 8.0.14

  • #7951

    Hopefully this is a one-off, but if consideration could be given to a message in the Clippings Cart text as I have suggested I think it would help to avoid any recurrence.

    See above:

    Regardless of whether this does turn out to be related to the clippings cart, I will investigate the feasibility of adding a log entry to record each time a clipping is created and downloaded, and by whom.

    It’s possibly overkill, but I have done three things:

    1. I have as you suggest added a sentence to the text at the head of the cart page warning of the need to keep it small, and talk to admin about larger requirements.
    2. I have set up a log entry for each time a user downloads a cart file. It gives user name, IP, time of download etc as all log entries do.
    3. I have added an email message to admins whenever a user downloads a cart file. It just gives user name and the statement that they downloaded a cart file. This will be on by default, but admins will be able to turn it off if preferred (admin user settings)
    Nigel
    My personal kiwitrees site is www.our-families.info
  • #7952

    Super – belt and braces! That should do it!

    Many thanks Nigel.

    Ron in France Website: https://clan-davies.kiwitrees.net/ kiwitrees 3.3.11; PHP 8.0.14
