
delete orphaned files/images from site/assets/files


interrobang


The image variations being created are the issue. Every width(), height(), or size() call used in a template or in the editor creates an image, and other modules might make more. Change that image (say, upload a different one) and now you have maybe one image and three variations that no longer need to be in /files, because they aren't used anywhere in PW and therefore aren't in the DB.

It would be nice to have a module that reads /files, compares it against everything PW knows about, and deletes the leftovers if you want it to.
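For anyone wondering what that comparison boils down to, here is a minimal sketch in plain PHP, with no ProcessWire involved. The helper name findOrphanedFiles() and the file names are made up for illustration, and the $valid list stands in for whatever the database knows about a page's files. It only reports; nothing is deleted.

```php
<?php
/**
 * Return the names of files in $dir that are not listed in $valid.
 * Hypothetical helper for illustration; it never deletes anything.
 */
function findOrphanedFiles(string $dir, array $valid): array
{
    $orphans = [];
    foreach (new DirectoryIterator($dir) as $entry) {
        // skip "." and "..", subdirectories, and anything that is not a regular file
        if ($entry->isDot() || !$entry->isFile()) {
            continue;
        }
        if (!in_array($entry->getFilename(), $valid, true)) {
            $orphans[] = $entry->getFilename();
        }
    }
    sort($orphans); // directory order is not guaranteed, so sort for stable output
    return $orphans;
}

// Demo on a throwaway directory standing in for one page's folder under /site/assets/files/
$dir = sys_get_temp_dir() . '/orphan-demo-' . uniqid();
mkdir($dir);
foreach (['photo.jpg', 'photo.200x0.jpg', 'old.jpg', 'old.100x100.jpg'] as $name) {
    touch("$dir/$name");
}

// Pretend the database only knows photo.jpg and its one variation
$valid = ['photo.jpg', 'photo.200x0.jpg'];
$orphans = findOrphanedFiles($dir, $valid);
print_r($orphans);

// clean up the demo directory
array_map('unlink', glob("$dir/*"));
rmdir($dir);
```

Reporting first and deleting in a separate, deliberate step keeps a wrong $valid list from costing you files.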


I think so too; a general maintenance module would be great.

There are some maintenance tips and pieces spread over this forum: cleaning up unneeded assets directories, cleaning up databases, and so on. There is even an extendable monitoring and maintenance module already, without much functionality yet, but anyway.

(For WordPress there is a module that can force rebuilding image variations/thumbs. I found that useful in some situations, and I miss that kind of tool here, too.)

Some examples:

https://processwire.com/talk/topic/4437-delete-orphaned-filesimages-from-siteassetsfiles/

https://processwire.com/talk/topic/9383-search-and-replace/?hl=search

http://modules.processwire.com/modules/process-diagnostics/ (can cleanup databases)


  • 9 months later...

I have a script derived from this thread that used to work, but now that I'm trying to run it on a dev branch project I get this error message:

Quote

Fatal error:  Call to undefined function wire() in freshthumbs.php on line 4

Here's the code:

<pre><?php
ini_set('max_execution_time', 60 * 5); // 5 minutes, increase as needed
include("./index.php");
$dir = new DirectoryIterator(wire('config')->paths->files);

foreach ($dir as $file) {
    if ($file->isDot() || !$file->isDir()) {
        continue;
    }
    $id = $file->getFilename();
    if (!ctype_digit("$id")) {
        continue;
    }
    $page = wire('pages')->get((int) $id);
    if (!$page->id) {
        echo "Orphaned directory: " . wire('config')->urls->files . "$id/" . $file->getBasename() . "\n";
        continue;
    }
    // determine which files are valid for the page
    $valid = array();
    foreach ($page->template->fieldgroup as $field) {
        if ($field->type instanceof FieldtypeFile) {
            foreach ($page->get($field->name) as $file) {
                $valid[] = $file->basename;
                if ($field->type instanceof FieldtypeImage) {
                    foreach ($file->getVariations() as $f) {
                        //$valid[] = $f->basename;
                    }
                }
                // keep thumbnails:
                /*
                if ($field->type instanceof FieldtypeCropImage) {
                    $crops = $field->getArray();
                    $crops = $crops['thumbSetting'];
                    $crops_a = explode("\n", $crops); // ie. thumbname,200,200 (name,width,height)
                    foreach ($crops_a as $crop) {
                        $crop = explode(",", $crop);
                        $prefix = wire('sanitizer')->name($crop[0]);
                        $valid[] = $prefix . "_" . $file->basename;
                    }
                }*/
            }
        }
    }
    // now find all the files present on the page
    // identify those that are not part of our $valid array
    $d = new DirectoryIterator($page->filesManager->path);
    foreach ($d as $f) {
        if ($f->isDot() || !$f->isFile()) {
            continue;
        }
        if (!in_array($f->getFilename(), $valid)) {
            echo "Orphaned file: " . wire('config')->urls->files . "$id/" . $f->getBasename() . "\n";
//             unlink($f->getPathname());
        }
    }
    wire('pages')->uncache($page); // just in case we need the memory
}

I tried prefixing wire with a $ and also with Processwire/, respectively. Processwire/wire returns this:

Quote

Notice:  Use of undefined constant Processwire - assumed 'Processwire' in freshthumbs.php on line 4

Fatal error:  Call to undefined function wire() in freshthumbs.php on line 4

and $wire :

Quote


Fatal error:  Function name must be a string in freshthumbs.php on line 4

Why is this?


4 hours ago, flydev said:

on dev branch you need to add the namespace.

Thanks @flydev, can you (or anyone) explain why it's necessary to manually add the Processwire namespace to this file? I thought the file compiler was supposed to take care of this automatically. If not, how do we know which files need to have the namespace added manually and which are compiled automatically?
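A note on the error itself, as far as I understand it: in ProcessWire 3.x the core lives in the ProcessWire namespace, so the function is really ProcessWire\wire(). The file compiler only rewrites files that ProcessWire loads itself (template files, modules); a standalone script that you open directly and that includes index.php is run by PHP as-is, so it needs the namespace line. The sketch below uses a made-up namespace Demo in place of ProcessWire to show the lookup on its own, so treat it as an illustration rather than what the core does internally.

```php
<?php namespace Demo; // stand-in for the ProcessWire namespace (hypothetical)

// Defined inside the namespace, so its full name is \Demo\wire
function wire(string $name): string
{
    return "api:$name";
}

// What a script without a namespace declaration would look for: a global wire()
$globalExists = function_exists('wire');
// What actually exists
$namespacedExists = function_exists('Demo\wire');

var_dump($globalExists, $namespacedExists);

// Calling it by its fully qualified name works from anywhere.
// Note the backslash: a forward slash would be the division operator.
$result = \Demo\wire('config');
echo $result, "\n";
```

That last comment also explains the "undefined constant" notice above: Processwire/wire(...) is parsed as a constant named Processwire divided by the result of wire(), not as a namespaced call.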


  • 2 months later...

Why is it throwing out the following error?

Spoiler

Warning: Unknown: It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in /home/pleasefi/public_html/clients/csdc-ganesh/clean-files.php on line 2

Fatal error: Namespace declaration statement has to be the very first statement in the script in /home/pleasefi/public_html/clients/csdc-ganesh/clean-files.php on line 2

 
 
 
 
 
 
 

This is the code of clean-file.php:

    <pre>
    <?php namespace Processwire;
    ini_set('max_execution_time', 1200); // 20 minutes, increase as needed
    include("./index.php"); 
    $dir = new DirectoryIterator(wire('config')->paths->files); 
     
    foreach($dir as $file) {
      if($file->isDot() || !$file->isDir()) continue; 
      $id = $file->getFilename();
      if(!ctype_digit("$id")) continue;  
      $page = wire('pages')->get((int) $id);
      if(!$page->id) {
        echo "Orphaned directory: " . wire('config')->urls->files . "$id/" . $file->getBasename() . "\n";
        continue; 
      }
      // determine which files are valid for the page
      $valid = array();
      foreach($page->template->fieldgroup as $field) {    
        if($field->type instanceof FieldtypeFile) {
          foreach($page->get($field->name) as $file) {
            $valid[] = $file->basename; 
            if($field->type instanceof FieldtypeImage) {
              foreach($file->getVariations() as $f) {
                $valid[] = $f->basename; 
              }
            }
          }
        }
      } 
      // now find all the files present on the page
      // identify those that are not part of our $valid array
      $d = new DirectoryIterator($page->filesManager->path); 
      foreach($d as $f) {
        if($f->isDot() || !$f->isFile()) continue; 
        if(!in_array($f->getFilename(), $valid)) {
          echo "Orphaned file: " . wire('config')->urls->files . "$id/" . $f->getBasename() . "\n";                               
          // unlink($f->getPathname()); 
        }
      }
      wire('pages')->uncache($page); // just in case we need the memory
    }
    ?>
    </pre>
  

 


@Pravin

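You need to remove the HTML from your clean-file.php: the first line (<pre>) and the last line (</pre>). The namespace declaration has to be the first statement in the file, and the <pre> in front of the opening <?php tag counts as inline HTML output that precedes it.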


  • 5 months later...

For some reason the script doesn't find anything for me when checking for the correct Fieldtype. $field->type instanceof FieldtypeImage and $field->type instanceof FieldtypeFile never return true for any of my image fields, but $field->type->name == "FieldtypeImage" does. My working validation loop looks like this:

  $valid = array();
  foreach($page->template->fieldgroup as $field) {   
    foreach($page->get($field->name) as $file) {
      $valid[] = $file->basename;
      if($field->type->name == "FieldtypeImage") {
        foreach($file->getVariations() as $f) {
          $valid[] = $f->basename; 
        }
      }
    }
  }

I'm running PW 3.0.42, PHP v5.5.27.
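One possible explanation, assuming the script was run without the namespace ProcessWire; line at the top: an unqualified FieldtypeFile in a global-namespace script refers to \FieldtypeFile, and instanceof against a class that doesn't exist quietly evaluates to false instead of raising an error. The sketch below reproduces that with a made-up Demo namespace and two empty stand-in classes; ReflectionClass::getShortName() plays the role of the name comparison here and is my substitution, not something ProcessWire necessarily uses.

```php
<?php namespace Demo; // stand-in for the ProcessWire namespace (hypothetical)

class FieldtypeFile {}
class FieldtypeImage extends FieldtypeFile {}

$type = new FieldtypeImage();

// A script in the global namespace that writes "instanceof FieldtypeFile"
// is really asking about \FieldtypeFile, which does not exist.
// instanceof does not complain about unknown classes; it just yields false.
$globalCheck = $type instanceof \FieldtypeFile;

// With the namespace in place the same check succeeds,
// and it also matches subclasses such as FieldtypeImage.
$namespacedCheck = $type instanceof \Demo\FieldtypeFile;

// Comparing the short class name ignores the namespace, but matches only that exact class
$shortName = (new \ReflectionClass($type))->getShortName();

var_dump($globalCheck, $namespacedCheck, $shortName);
```

If that is what happened, adding the namespace and keeping instanceof is probably the more robust fix, since a name comparison skips any other fieldtype that extends the file type.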


  • 1 month later...
  • 3 years later...

When using the Processwire namespace I also needed to add a leading backslash to the DirectoryIterator class so it can access the global namespace, in case someone needs to use it with the latest PW and PHP versions.

<?php namespace ProcessWire;

ini_set('max_execution_time', 60*5); // 5 minutes, increase as needed
include("./index.php"); 
echo '<pre>'; // output has to start after the namespace line and after ProcessWire has booted
// Add a leading backslash to the class so it resolves in the global namespace
$dir = new \DirectoryIterator(wire('config')->paths->files); 
 
foreach($dir as $file) {
  if($file->isDot() || !$file->isDir()) continue; 
  $id = $file->getFilename();
  if(!ctype_digit("$id")) continue;  
  $page = wire('pages')->get((int) $id);
  if(!$page->id) {
    echo "Orphaned directory: " . wire('config')->urls->files . "$id/" . $file->getBasename() . "\n";
    continue; 
  }
  // determine which files are valid for the page
  $valid = array();
  foreach($page->template->fieldgroup as $field) {    
    if($field->type instanceof FieldtypeFile) {
      foreach($page->get($field->name) as $file) {
        $valid[] = $file->basename; 
        if($field->type instanceof FieldtypeImage) {
          foreach($file->getVariations() as $f) {
            $valid[] = $f->basename; 
          }
        }
      }
    }
  } 
  // now find all the files present on the page
  // identify those that are not part of our $valid array
  // Add a leading back slash to the class so it can access the global namespace
  $d = new \DirectoryIterator($page->filesManager->path); 
  foreach($d as $f) {
    if($f->isDot() || !$f->isFile()) continue; 
    if(!in_array($f->getFilename(), $valid)) {
      echo "Orphaned file: " . wire('config')->urls->files . "$id/" . $f->getBasename() . "\n";                               
      // unlink($f->getPathname()); 
    }
  }
  wire('pages')->uncache($page); // just in case we need the memory
}
?>
</pre>
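Why the backslash is needed, as a minimal sketch: inside a namespace, an unqualified class name is looked up in that namespace only, while unqualified function calls fall back to the global function. Demo is a made-up namespace standing in for ProcessWire, and the temp directory is just something that exists on every system.

```php
<?php namespace Demo; // stand-in for the ProcessWire namespace (hypothetical)

// Unqualified class names are resolved relative to the current namespace only,
// so "new DirectoryIterator(...)" here would look for \Demo\DirectoryIterator.
$unqualifiedExists = class_exists('Demo\DirectoryIterator');

// The leading backslash points at the built-in class in the global namespace.
$dir = new \DirectoryIterator(sys_get_temp_dir());
$isBuiltin = $dir instanceof \DirectoryIterator;

// Functions behave differently: an unqualified call falls back
// to the global function when no \Demo\strlen() exists.
$length = strlen('files');

var_dump($unqualifiedExists, $isBuiltin, $length);
```

An alternative to the backslash is a use DirectoryIterator; line directly after the namespace declaration, which lets the rest of the script keep the short name.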

 


  • 1 year later...
On 9/8/2013 at 2:13 PM, ryan said:

I sometimes end up with orphaned files as a result of doing mass imports during development. My code won't be quite right the first time around and I'll end up with extra and/or duplicated files. At least that was the case this last week. It was on a pretty large scale, so not something I wanted to clean up manually. Here's how I cleaned them out. Place this in a file in your site root called clean-files.php and then load it in your browser. 

/clean-files.php

 

Hi,

I tried this because of troubles with orphaned files and repeater data and so on (see Orphaned repeater data?)

I created the file clean-files.php in the root and pasted the code as given.

The result is...

Warning:  session_set_save_handler(): Session save handler cannot be changed after headers have already been sent in /var/www/html/wire/core/WireSessionHandler.php on line 57
Warning:  session_name(): Session name cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 291
Warning:  ini_set(): Session ini settings cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 294
Warning:  ini_set(): Session ini settings cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 295
Warning:  ini_set(): Session ini settings cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 296
Warning:  ini_set(): Session ini settings cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 297
Warning:  ini_set(): Session ini settings cannot be changed after headers have already been sent in /var/www/html/wire/core/Session.php on line 309
Fatal error:  Uncaught Error: Call to undefined function wire() in /var/www/html/clean-files.php:7
Stack trace:
#0 {main}
  thrown in /var/www/html/clean-files.php on line 7
Warning:  Cannot modify header information - headers already sent by (output started at /var/www/html/clean-files.php:1) in /var/www/html/wire/core/WireHttp.php on line 1688

I also tried it with namespace ProcessWire

 


@Carsten The echo '<pre>' part comes too early. Change the first few lines to:

<?php namespace Processwire;
ini_set('max_execution_time', 60 * 5); // 5 minutes, increase as needed
include("./index.php");
echo '<pre>'; // rest below unchanged:
// Add a leading back slash to the class so it can access the global namespace
$dir = new \DirectoryIterator(wire('config')->paths->files);

Tried the script like this and it works just fine (PW 3.0.205)


@dragan

Thank you, now orphaned directories are shown, but not the ones from repeaters.

I deleted all pages and repeaters from pages, and there are still some folders which are not shown as orphaned.

I tried manually creating a folder, e.g. "9999", and it was shown as orphaned. Then I manually created a file in a non-orphaned folder, and the file was shown as orphaned. So far so good.

But this repeater stuff doesn't disappear, and I don't know how to delete it.

 

id: 3538
name: 1667137707-9667-6

ProcessWire\RepeaterPage Object
(
    [id] => 3538
    [name] => 1667137707-9667-6
    [parent] => /ppww/repeaters/for-field-155/for-page-0/
    [template] => repeater_blog_repeater
    [name1110] => 
    [status1110] => 1
    [data] => Array
        (
            [name1110] => 
            [status1110] => 1
        )

)

I did not solve it, so I ran tests with Fieldset (Page) instead. There is also a Repeaters section, but with my fieldset.

[Screenshot attached in the original post]

 

It's not what I wanted, but the fieldsets are deleted when I delete a page manually or via the API, so there's no waste from orphaned folders and files.

The orphaned files, folders and entries in admin/Repeaters/ ... still exist and could not be detected with Ryan's script (clean-files.php).

Could this then be a bug?

Thank you for your help

Carsten

 
