geogphotos Posted August 9, 2019 Share Posted August 9, 2019 Somehow I have uploaded a few duplicates over the years. If I download my data from AIM is there an easy way to spot my duplicate files refs in the Excel file without doing it by eye? Some sort of command that operates down a row of data to spot my duplicate file names? Any ideas? Link to comment Share on other sites More sharing options...
ACC Posted August 9, 2019 Share Posted August 9, 2019 18 minutes ago, ACC said: From a data analytics point of view it’s very easy. I think the question is whether or not Alamy think thats a service they want to provide. If you want some XL type coaching I’m just across the river! I don’t know the Numbers spreadsheet though. Ian i’ve just done it in Numbers. Different to Excel but I got there, same COUNTIF function. Unfortunately no way to attach a screenshot. Email me if you want to take this offline before Alamy notice we’ve turned this board into an IT helpdesk! (sorry Alamy) ange.chalmers@btinternet.com Link to comment Share on other sites More sharing options...
John Richmond Posted August 10, 2019 Share Posted August 10, 2019 2 minutes ago, geogphotos said: Thanks to all of you for your help and comments. Angela has done the deed and I have been able to quickly delete loads of duplicates. Thank you John for your kind offer to do this for me. No problem, Ian. And well done Angela. Link to comment Share on other sites More sharing options...
geogphotos Posted August 10, 2019 Author Share Posted August 10, 2019 Thanks to all of you for your help and comments. Angela has done the deed and I have been able to quickly delete loads of duplicates. Thank you John for your kind offer to do this for me. Link to comment Share on other sites More sharing options...
Harry Harrison Posted August 10, 2019 Share Posted August 10, 2019 53 minutes ago, MDM said: Lightroom may not get confused with duplicate names but it is possible that you could mess up a backup Thanks, I take your point but my backup software hasn't had a problem so far fortunately. I did consider adding a suffix to the 'rolled over' images but it wouldn't help with those already up there on Alamy so I think I still couldn't use the LR/Alamy plugin though I may need to read up about that again, I think he does describe some kind of workaround. Incidentally, in Filemaker finding duplicates is very easy, just search with a '!' in the desired field. In fact looking at the manual for Alamy-Lightroom Bridge here I can see that although I have made life difficult for myself I can get around it. From the csv export I will know which dupes I have to contend with and then I simply (?) manually enter the unique Alamy ref into the requisite field in Lightroom using a visual comparison in AIM. Once that is done I can do as you suggest and add a suffix so all my LR filenames are unique and then make sure my cameras don't add more dupes. Link to comment Share on other sites More sharing options...
MDM Posted August 10, 2019 Share Posted August 10, 2019 44 minutes ago, Harry Harrison said: I imagine that I could/should have changed my camera settings to to roll over at 99,999 rather than 9,999 but in practice I don't find it a workflow problem so I've never bothered, a bit late to do so now anyway. On the other hand I believe that the LR/Alamy plugin relies on unique filenames in Lightroom to match up with those on Alamy so that is a limitation as I quite fancy trying it. Lightroom itself doesn't of course treat images with the same filename as a duplicate when importing with the "Don't import suspected duplicates" check-box checked as it looks at other parameters as well, "Lightroom determines a photo is a duplicate of another file in the catalog if it has the same, original filename; the same Exif capture date and time; and the same file size." Lightroom may not get confused with duplicate names but it is possible that you could mess up a backup (manual or automatic) with duplicate filenames. It is not a good idea in general in my opinion. You can use the Rename function in Lightroom to rename a batch with Custom Name - Sequence in a matter of seconds. Link to comment Share on other sites More sharing options...
Harry Harrison Posted August 10, 2019 Share Posted August 10, 2019 19 minutes ago, spacecadet said: My only defence against duplicates is working through images when new I imagine that I could/should have changed my camera settings to to roll over at 99,999 rather than 9,999 but in practice I don't find it a workflow problem so I've never bothered, a bit late to do so now anyway. On the other hand I believe that the LR/Alamy plugin relies on unique filenames in Lightroom to match up with those on Alamy so that is a limitation as I quite fancy trying it. Lightroom itself doesn't of course treat images with the same filename as a duplicate when importing with the "Don't import suspected duplicates" check-box checked as it looks at other parameters as well, "Lightroom determines a photo is a duplicate of another file in the catalog if it has the same, original filename; the same Exif capture date and time; and the same file size." Link to comment Share on other sites More sharing options...
spacecadet Posted August 10, 2019 Share Posted August 10, 2019 11 hours ago, Harry Harrison said: Certainly for me duplicate filenames do not (necessarily) mean duplicate images, I have a lot of duplicate filenames on Alamy as both my Canon & Fuji have gone round the clock. However I use Lightroom collections for my Alamy uploads so it's relatively easy to identify any that have been uploaded by mistake, and relatively difficult to do so in the first place, though I have a tiny fraction of your port. Same here. I have three or four of some, but inlike we two most here probably don't use OOC filenames unchanged for uploading. LR will change them on export for you. My only defence against duplicates is working through images when new, and even then it doesn't always work. Link to comment Share on other sites More sharing options...
John Richmond Posted August 10, 2019 Share Posted August 10, 2019 8 hours ago, ACC said: Haha looks like John and i should start a consultancy 😄 Been there, done that. Now retired and nothing would drag me back except for a couple of old clients whose Excel solutions I set up and still maintain. Link to comment Share on other sites More sharing options...
Harry Harrison Posted August 9, 2019 Share Posted August 9, 2019 58 minutes ago, geogphotos said: Would this be something that Alamy could easily do and then inform those contributors who ask? Certainly for me duplicate filenames do not (necessarily) mean duplicate images, I have a lot of duplicate filenames on Alamy as both my Canon & Fuji have gone round the clock. However I use Lightroom collections for my Alamy uploads so it's relatively easy to identify any that have been uploaded by mistake, and relatively difficult to do so in the first place, though I have a tiny fraction of your port. Link to comment Share on other sites More sharing options...
ACC Posted August 9, 2019 Share Posted August 9, 2019 Haha looks like John and i should start a consultancy 😄 Link to comment Share on other sites More sharing options...
John Richmond Posted August 9, 2019 Share Posted August 9, 2019 25 minutes ago, geogphotos said: I'm overwhelmed. Would this be something that Alamy could easily do and then inform those contributors who ask? Or would it be a lot of work? How easy is easy? Very easy - if you know what you are doing. 5 minutes max. But I don't see Alamy offering the service unless they're prepared to code it into their download facility. In the meantime if you want to send a copy of the download file to john@tibouchinatraining.co.uk I'll find the 5 minutes over the weekend. Link to comment Share on other sites More sharing options...
ACC Posted August 9, 2019 Share Posted August 9, 2019 4 minutes ago, geogphotos said: I'm overwhelmed. Would this be something that Alamy could easily do and then inform those contributors who ask? Or would it be a lot of work? How easy is easy? From a data analytics point of view it’s very easy. I think the question is whether or not Alamy think thats a service they want to provide. If you want some XL type coaching I’m just across the river! I don’t know the Numbers spreadsheet though. Link to comment Share on other sites More sharing options...
geogphotos Posted August 9, 2019 Author Share Posted August 9, 2019 I'm overwhelmed. Would this be something that Alamy could easily do and then inform those contributors who ask? Or would it be a lot of work? How easy is easy? Link to comment Share on other sites More sharing options...
John Richmond Posted August 9, 2019 Share Posted August 9, 2019 Two good approaches from Mark and Angela (ACC). I'll agree with Mark's COUNTIF modification. I've found that using A:A rather than absolute references can occasionally cause problems in Excel (though it does work most of the time). Another alternative is to use a Pivot table to identify duplicates. You'd only need the column containing the duplicate references and a count column sorted by high to low values to easily pick out the culprits. Once you've found them in the main list with any of the methods it's then easy to pick up the Alamy ID and search and then delete in AIM. Link to comment Share on other sites More sharing options...
M.Chapman Posted August 9, 2019 Share Posted August 9, 2019 3 hours ago, ACC said: In Excel i’d say COUNTIF function on the column holding the identifying values. Then a simple filter for results >1. I don’t know Numbers but assume it has a help index and a similar function? =COUNTIF (A:A,A1) A:A = range, A1 (the value in A1) = criteria That's neat. In Excel I think the range would need the numeric row number references of the first and last row prefixed with a dollar symbol. For example, assuming there are 5000 rows of data with the filename in 1st column (A) and there's 1 header row, these equations will be in in rows 2 and 3. =COUNTIF(A$2:A$5000, A2) =COUNTIF(A$2:A$5000, A3) The $ prefix will ensure that when cut and paste is used to fill the column with the equation the range reference remains locked. Mark Link to comment Share on other sites More sharing options...
ACC Posted August 9, 2019 Share Posted August 9, 2019 In Excel i’d say COUNTIF function on the column holding the identifying values. Then a simple filter for results >1. I don’t know Numbers but assume it has a help index and a similar function? =COUNTIF (A:A,A1) A:A = range, A1 (the value in A1) = criteria Link to comment Share on other sites More sharing options...
meanderingemu Posted August 9, 2019 Share Posted August 9, 2019 1 hour ago, M.Chapman said: 1) Sort on the column containing the filenames, so that the duplicates will appear on successive rows 2) Add a new column 3) Fill with equation from row 2 downwards along the lines of Cell = IF (Filename in Row1 = Filename in Row2 , "X","") In Excel, if the filenames are in Column A, the equation in Row 2 of the new column would look like this =IF(A1=A2,"X","") Copy and paste the formula to fill the column and the equation will auto-increment the row numbers for you. 4) This will put an "X" in the rows containing a filenames that is a duplicate of the row before, making them easy to spot 5) You can then sort on new column to get all the duplicates in a condensed list Mark more or less what i was going to propose (using 0s and 1s) , but using filter in point 5, instead on re sort Link to comment Share on other sites More sharing options...
M.Chapman Posted August 9, 2019 Share Posted August 9, 2019 41 minutes ago, geogphotos said: Somehow I have uploaded a few duplicates over the years. If I download my data from AIM is there an easy way to spot my duplicate files refs in the Excel file without doing it by eye? Some sort of command that operates down a row of data to spot my duplicate file names? Any ideas? 1) Sort on the column containing the filenames, so that the duplicates will appear on successive rows 2) Add a new column 3) Fill with equation from row 2 downwards along the lines of Cell = IF (Filename in Row1 = Filename in Row2 , "X","") In Excel, if the filenames are in Column A, the equation in Row 2 of the new column would look like this =IF(A1=A2,"X","") Copy and paste the formula to fill the column and the equation will auto-increment the row numbers for you. 4) This will put an "X" in the rows containing a filenames that is a duplicate of the row before, making them easy to spot 5) You can then sort on new column to get all the duplicates in a condensed list Mark Link to comment Share on other sites More sharing options...
geogphotos Posted August 9, 2019 Author Share Posted August 9, 2019 12 minutes ago, spacecadet said: I haven't looked at it in detail, but this https://analysistabs.com/vba/find-duplicate-values-column/ has a macro for finding duplicate strings. Thanks for the link though it looks hard to me. I did mean column not row, and actually I use Numbers not Excel. So in principle it is possible. I found this: How do I find duplicates in the same column? To do so: Select the range of cells you wish to test. ... On Excel's Home tab, choose Conditional Formatting, Highlight Cells Rules, and then Duplicate Values. Click OK within the Duplicate Values dialog box to identify the duplicate values. Duplicate values in the list will now be identified. Link to comment Share on other sites More sharing options...
spacecadet Posted August 9, 2019 Share Posted August 9, 2019 I haven't looked at it in detail, but this https://analysistabs.com/vba/find-duplicate-values-column/ has a macro for finding duplicate strings. Link to comment Share on other sites More sharing options...
Question
geogphotos
Somehow I have uploaded a few duplicates over the years.
If I download my data from AIM is there an easy way to spot my duplicate files refs in the Excel file without doing it by eye?
Some sort of command that operates down a row of data to spot my duplicate file names?
Any ideas?
Link to comment
Share on other sites
20 answers to this question
Recommended Posts
Archived
This topic is now archived and is closed to further replies.