So I learned how to delete hidden files earlier today with find and rm, and I know how to write a shell script. SFTP clients upload files to the /home/incoming/images/ directory. Every day they upload tons of files, which are processed within a day or two by a Python script. The directory is becoming very large. It was suggested that I delete files older than 30 days, but the rm man page does not list any such option, and the rm command on CentOS does not have one. So is there any way I can write a shell script to delete files older than 30 days from /home/incoming/images/?
No need to write a shell script. Just use find with the -mtime option:
CentOS Linux: Delete All *.png Files Older Than 30 Days
find /home/incoming/images/ -type f -iname '*.png' -mtime +30 -delete
Of course, you can use the rm command instead:
find /home/incoming/images/ -type f -iname '*.jpg' -mtime +30 -exec rm {} \;
Delete all files in /home/incoming/images/ older than 30 days:
find /home/incoming/images/ -type f -iname '*' -mtime +30 -exec rm -v "{}" +
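Before pointing any of these commands at real uploads, it may be worth rehearsing in a scratch directory. The following is a minimal sketch, assuming GNU touch (for -d '40 days ago') and a find that supports -delete, as on CentOS. The demo_dir directory and the file names are made up for illustration.

```shell
# Hypothetical sandbox so that nothing real is touched.
demo_dir=$(mktemp -d)
touch -d '40 days ago' "$demo_dir/old.png"   # GNU touch: backdate the mtime
touch "$demo_dir/new.png"                    # modified just now

# Dry run: -print only lists what would be removed.
find "$demo_dir" -type f -iname '*.png' -mtime +30 -print

# Same tests, but this time actually delete the matches.
find "$demo_dir" -type f -iname '*.png' -mtime +30 -delete
```

Swapping -print for -delete only once the listing looks right is a cheap way to avoid surprises.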
Set up a cron job to clean the folder daily
You might want to set up a cron job that runs the cleanup once a day:
@daily /usr/bin/find /home/incoming/images/ -type f -iname '*' -mtime +30 -delete
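If the cron entry grows beyond a one-liner, a small wrapper may be easier to maintain. This is only a sketch: the function name cleanup_old_files and its directory check are assumptions added for illustration, not part of the original answer.

```shell
#!/bin/sh
# cleanup_old_files DIR DAYS
# Delete regular files under DIR whose mtime is more than DAYS days old.
cleanup_old_files() {
    dir=$1
    days=$2
    # Refuse to run on a missing directory so that a typo does not go unnoticed.
    if [ ! -d "$dir" ]; then
        echo "cleanup: $dir is not a directory" >&2
        return 1
    fi
    find "$dir" -type f -mtime +"$days" -delete
}

# Example call (the path comes from the question above):
# cleanup_old_files /home/incoming/images 30
```

Saved under a hypothetical path such as /usr/local/bin/cleanup-images.sh with the example call uncommented, the script could be run from cron by replacing the find command in the @daily entry with that path.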
Making use of xargs
As pointed out by @allengarvin, there is a way to be more efficient in the use of fork() and exec() without running up against limits. Here is a combination of find and xargs:
find /path/to/my/stuff -mtime +30 -print | xargs /bin/rm
find /home/incoming/images/ -type f -name '*' -mtime +30 -print | xargs /bin/rm
## Deal with unusual characters in filenames ##
find /home/incoming/images/ -type f -name '*' -mtime +30 -print0 | xargs -0 /bin/rm
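To see why the NUL-separated form matters, here is a small sketch with awkward file names. It assumes GNU touch, find and xargs. The odd_dir directory and its contents are invented for the demonstration.

```shell
# Hypothetical sandbox with names that contain a space and a quote.
odd_dir=$(mktemp -d)
touch -d '40 days ago' "$odd_dir/holiday photo.png" "$odd_dir/it's old.png"
touch "$odd_dir/recent.png"

# -print0 and -0 pass each name NUL-terminated, so spaces, quotes and even
# newlines reach rm intact. The -- stops rm from treating a name as an option.
find "$odd_dir" -type f -mtime +30 -print0 | xargs -0 rm -f --

ls -A "$odd_dir"   # prints: recent.png
```

With plain -print, xargs would split "holiday photo.png" into two arguments and stop with an error on the unmatched quote in "it's old.png".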
Again, read man pages to understand all options:
man find
man xargs
It might be nice to also discuss why you might prefer one or the other. Originally with find, the only way to do this with standard command-line tools was find path -exec rm {} \;. That existed in UNIX version 7, way back in 1979. It could have been quite slow and expensive because every rm would have been a separate fork(2)/exec(2) call, and each fork would copy the entire address space of the process.
Later, 3BSD introduced vfork, which would have improved performance a lot by letting the child borrow the parent’s address space instead of copying it, but it’d still be a separate fork and exec for every command. Then 4.3BSD introduced the xargs command, which reads strings from standard input and passes them as arguments to the command given as ‘xargs cmd’, packing in as many as the maximum command-line length allows, and only forks another process when there is input remaining and that limit has been reached. Thus, you would have find path [criteria] -print | xargs rm.
Doing the command via -exec rm {} \; should still be considered relatively expensive, and I wouldn’t use it except with a simple command like rm, though there might be other instances where you want to do something more complicated there.
I’m not sure when -delete came about. I don’t think it was present in 4.4BSD, nor is it present in POSIX. If you have -delete, that’s definitely the highest-performing version. If you only have xargs, then pipe to xargs (adding --no-run-if-empty if the GNU extensions are available).
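The effect of --no-run-if-empty can be checked with a directory that has nothing to delete. This is a sketch assuming GNU xargs, where -r is the short spelling of --no-run-if-empty. The empty_dir directory and the two status variables are illustrative names only.

```shell
# Hypothetical empty directory: find will produce no output at all.
empty_dir=$(mktemp -d)

# Plain xargs: on GNU systems rm is still started once, with no operands.
if find "$empty_dir" -type f -mtime +30 -print0 | xargs -0 rm 2>/dev/null; then
    plain_status=ok
else
    plain_status=failed
fi

# With -r (--no-run-if-empty), rm is not started when there is no input.
if find "$empty_dir" -type f -mtime +30 -print0 | xargs -0 -r rm; then
    guarded_status=ok
else
    guarded_status=failed
fi

echo "plain: $plain_status, guarded: $guarded_status"
```

On a GNU system the first pipeline should report a failure because rm complains about a missing operand, which is the kind of noise that ends up in cron mail.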
I updated my answer.
Remember also that with find and xargs you can do some more interesting things.
When manipulating large uploaded images, I run a find that creates a smaller thumbnail of the same name in a subdirectory, using xargs and an ImageMagick script.
On getting a multicore processor, I went all out parallel for that process.
Gotta love Unix
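The commenter's script is not shown, so the following is only a guess at the general shape of such a parallel thumbnail job. It assumes ImageMagick's convert is on the PATH and that xargs supports -0, -r and -P (GNU and BSD extensions). The make_thumbs function, the thumbs subdirectory and the 200x200 size are all hypothetical choices.

```shell
# make_thumbs DIR NJOBS
# Write a thumbnail of every *.jpg directly in DIR to DIR/thumbs/ under the
# same file name, running up to NJOBS convert processes at once.
make_thumbs() {
    src=$1
    njobs=$2
    mkdir -p "$src/thumbs"
    # -n 1 hands one image to each sh invocation, -P caps the parallelism.
    find "$src" -maxdepth 1 -type f -iname '*.jpg' -print0 |
        xargs -0 -r -n 1 -P "$njobs" sh -c '
            convert "$1" -resize 200x200 "$(dirname "$1")/thumbs/$(basename "$1")"
        ' sh
}

# Example call with four workers (hypothetical path):
# make_thumbs /home/incoming/images 4
```

Setting the second argument close to the number of CPU cores is a reasonable starting point, since resizing is mostly CPU-bound.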
Difference between -exec and xargs
- exec
find /home/user/logs -type f -name '*.log' -mtime +30 -exec rm -f {} \;
-exec with \; runs rm once per file, something like this:
rm -f something1.log
rm -f something2.log
rm -f something3.log
- xargs
find /home/user/logs -type f -name '*.log' -mtime +30 | xargs rm -f
xargs passes many file names to a single rm, like below:
rm -f something1.log something2.log something3.log
find can do it too:
find ... -exec rm -f {} +
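The difference is easy to measure by replacing rm with a harmless command that prints one line per invocation. This is a sketch assuming a find that supports -exec ... + and -print0, and an xargs that supports -0. The count_dir directory and the variable names are invented for the test.

```shell
# Hypothetical sandbox with three log files.
count_dir=$(mktemp -d)
touch "$count_dir/something1.log" "$count_dir/something2.log" "$count_dir/something3.log"

# Each output line corresponds to one process started for the matched files.
per_file=$(find "$count_dir" -type f -name '*.log' \
    -exec sh -c 'echo "$# file(s)"' sh {} \; | wc -l)
batched=$(find "$count_dir" -type f -name '*.log' \
    -exec sh -c 'echo "$# file(s)"' sh {} + | wc -l)
via_xargs=$(find "$count_dir" -type f -name '*.log' -print0 |
    xargs -0 sh -c 'echo "$# file(s)"' sh | wc -l)

echo "per-file: $per_file, batched: $batched, xargs: $via_xargs"
```

With only three files, the \; form starts three processes, while the + form and xargs start one each.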