Duplicate Removal - PERL

Question

Hi everyone,

I wanted to get your opinion on the best way to remove duplicates from a list of strings...

This is what I'm doing for now (it's not great and that's why I'm posting ;)

1) I'm storing them in a simple array (not associative)

push(@stock, $elem);

2) I sort them

@stock=sort(@stock);

3) I go through all the elements and check if the next one is identical. If identical, I assign it the value ""

for ($i = 0; $i < @stock; $i++)
{
    if (@stock > $i + 1)
    {
        $stock[$i] = '' if $stock[$i] eq $stock[$i+1];
    }
}

4) I sort again (I warned you... my algo is really not great) :(

@stock=sort(@stock);

5) Then, again, I go through the array while removing elements when I encounter an empty string... if I find something different, I exit...

while ($sender[0] eq '')
{
    shift(@sender);
    if (!@sender)
    {
        last;
    }
}

... all this to get rid of duplicates.... :(
So if anyone sees a more efficient way than mine, for instance a ready-made Perl function like eradikerdoublons() (I've looked, and this super function doesn't exist in Perl ;) or an improvement to my algorithm, I'd appreciate it if you could explain it to me ;)
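
A minimal sketch of how steps 2 to 5 above could be collapsed into a single pass, assuming the strings are already in @stock: once the list is sorted, duplicates sit next to each other, so it is enough to keep an element only when it differs from the last one kept. The sub name dedupe_sorted and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted list without duplicates.
sub dedupe_sorted {
    my @sorted = sort @_;
    my @unique;
    for my $elem (@sorted) {
        # After sorting, equal strings are adjacent: compare with the last kept one.
        push @unique, $elem if !@unique || $unique[-1] ne $elem;
    }
    return @unique;
}

my @stock = ('b@example.org', 'a@example.org', 'b@example.org', '');
my @clean = dedupe_sorted(@stock);
print join(',', @clean), "\n";   # ,a@example.org,b@example.org
```

Because this version does not use '' as a "deleted" marker, a genuine empty string in the input is kept (once) instead of being thrown away.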

nenecg · Accepted Answer

Here is a solution:

my (%saw, @out);
# grep keeps an element only the first time it is seen; %saw records what has been seen
@out = sort(grep(!$saw{$_}++, @stock));

The @out array contains the list without duplicates.
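
To make the idiom easier to follow, here is a small self-contained sketch. It wraps the same grep filter in a sub (the name unique_in_order and the sample data are illustrative, not part of the answer) and leaves out the sort, which shows that the filter on its own keeps the first occurrence of each string in its original position.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first occurrence of each string, in input order.
sub unique_in_order {
    my %saw;
    # !$saw{$_}++ is true only the first time $_ is seen; the post-increment then records it.
    return grep { !$saw{$_}++ } @_;
}

my @stock = ('pear', 'apple', 'pear', 'fig', 'apple');
my @out   = unique_in_order(@stock);
print "@out\n";           # pear apple fig
my @sorted = sort @out;   # add the sort back if alphabetical order is wanted
print "@sorted\n";        # apple fig pear
```

Recent versions of the core List::Util module also export a uniq function that does the same job, if it is available on your installation.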

bibi · Answer

Hey!!!!

You're not going to tell me that my code is super optimized, are you???? ;-)

If you don't know PERL, it's okay, just write to me in the language you know best or in pseudocode, or simply in French, what you would do to improve my code (I'll take care of transforming it into PERL)

I'm eagerly awaiting your suggestions, thanks :)

sebsauvage · Answer

And why not place your strings in a hash table? It makes it very easy to eliminate duplicates.  (I can't remember the syntax in Perl.)

sebsauvage · Answer

To give the example in Python, a program that takes a file A.txt, removes duplicates, sorts, and writes the result to B.txt (I'll keep it compact):

items = dict( [ (line,0) for line in open('A.txt','rb').read().split('\n')] ).keys()
items.sort()
open('B.txt','w+b').write('\n'.join(items))

Isn't it beautiful, huh? Three little lines of code? (Okay, I agree it's not great for readability if you don't know Python :)
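
For comparison, a rough Perl counterpart of the same task might look like the sketch below. It assumes one string per line and reuses the file names A.txt and B.txt from the Python example; the sub name dedupe_file is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read $in, drop duplicate lines, sort, write to $out.
sub dedupe_file {
    my ($in, $out) = @_;
    open(my $src, '<', $in) or die "Cannot read $in: $!";
    chomp(my @lines = <$src>);
    close($src);

    my %seen;
    my @unique = grep { !$seen{$_}++ } @lines;   # first occurrence of each line
    my @items  = sort @unique;

    open(my $dst, '>', $out) or die "Cannot write $out: $!";
    print {$dst} join("\n", @items);
    close($dst) or die "Cannot close $out: $!";
    return scalar @items;   # number of distinct lines written
}

# Only run on the default file names when A.txt actually exists.
dedupe_file('A.txt', 'B.txt') if -e 'A.txt';
```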

Bobinours · Answer

I had the same idea as sebsauvage, using a HASH table.

my %h_senders;
foreach my $elem (@sender) {   # adapt the loop to however you get your elements
    $h_senders{$elem}++;       # curly braces: this is a hash, not an array
}

By the way, this allows you to know the number of occurrences of the element (thanks to the increment).

So if you display:

print $h_senders{'je.suppose@que.cesont.des.email'};

It will display the number of times the email "je.suppose@que.cesont.des.email" is found.
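
Put together as a runnable sketch (the sample addresses and the sub name count_occurrences are invented for illustration), the counting idea could look like this. Note the curly braces for hash access and the single quotes around the addresses, which stop Perl from trying to interpolate the part after the @ as an array.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping each string to its count.
sub count_occurrences {
    my %h_senders;
    $h_senders{$_}++ for @_;
    return \%h_senders;
}

my @sender = ('a@example.org', 'b@example.org', 'a@example.org');
my $counts = count_occurrences(@sender);

# The keys are the distinct addresses; the values are the occurrence counts.
for my $addr (sort keys %$counts) {
    printf "%s: %d\n", $addr, $counts->{$addr};
}
# a@example.org: 2
# b@example.org: 1
```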
--
-= Bobinours - http://bobin.underlands.org =-

bheadman · Answer

I hope he has found his solution by now ^^,
but basically the idea (in Perl) is to use hashes (associative arrays).
You store the desired values as keys and count the occurrences of each key (the number of times it appears in the initial list). In a Perl hash, a given key can exist only once (so no duplicates, in a way), which solves the problem.
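
As a small illustration of that property (a sketch only; the sub name unique_keys and the sample data are made up), assigning to a hash slice stores every string as a key in one statement, and keys then returns each distinct string once. Hash keys come back in no particular order, so the result is sorted here.

```perl
use strict;
use warnings;

# Hypothetical helper: distinct strings via hash keys, sorted for a stable order.
sub unique_keys {
    my %h;
    @h{@_} = ();   # hash slice: each string becomes a key; repeats reuse the same key
    return sort keys %h;
}

my @stock = ('pear', 'apple', 'pear', 'fig');
my @out   = unique_keys(@stock);
print "@out\n";   # apple fig pear
```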

There you go, I think that's what is proposed in the last post (it's been a while since I've touched Perl) but that's the avenue to explore.

Best regards.
Nicolas
