
Second largest value from list

02-17-2015 07:14 AM
DaveMiller
Emerging Contributor

Hi folks,

 

I know how to use the built-in Python function to derive the maximum value across a number of input fields.

 

What I am now trying to do is derive the second largest value across a number of input fields, and retain the name of that field.

 

For example if I have a set of fields as below I'd like to return "Area 3" and 15 from this list...

 

Area 1 = 1

Area 2  = 5

Area 3 = 15

Area 4 = 16

Area 5 = 10

 

Any ideas? Is there some sort of combination of rank or sort I can use that will give me the answer?

 

Cheers,

Dave


Accepted Solutions
DanPatterson_Retired
MVP Emeritus

If you know Python and can adapt this, follow the example:

>>> a = [3,5,2,4,1]   # take your list
>>> a.sort()          # sort it
>>> a                 # have a look-see
[1, 2, 3, 4, 5]
>>> a[-1]             # get the max by indexing from the end
5
>>> a[-2]             # 2nd largest...same old idea
4
>>>
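
To keep the field name alongside the value, as the original question asks, one option is to pair each name with its value and sort the pairs. This is only a minimal sketch, assuming the values have already been collected into a dict; the `areas` dict and the `second_largest_field` helper are hypothetical names for illustration, not part of any ArcGIS API.

```python
def second_largest_field(values_by_field):
    """Return (field_name, value) for the second-largest value.

    Duplicates are kept, so a tie for the maximum returns the maximum again.
    """
    if len(values_by_field) < 2:
        raise ValueError("need at least two fields")
    # Sort the (name, value) pairs by value, smallest first
    ranked = sorted(values_by_field.items(), key=lambda item: item[1])
    return ranked[-2]


# Hypothetical input built from the sample values in the question
areas = {"Area 1": 1, "Area 2": 5, "Area 3": 15, "Area 4": 16, "Area 5": 10}
print(second_largest_field(areas))  # ('Area 3', 15)
```

Using `sorted()` rather than `list.sort()` leaves the original data untouched, which may matter if the same values are reused elsewhere.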

DaveMiller
Emerging Contributor

Thanks Dan - will give this a whirl.

It looks straightforward enough!

Dave

JoshuaBixby
MVP Esteemed Contributor

Dan Patterson's suggestion is likely the simplest, since it relies only on built-in list methods and indexing, but there are still some things you should think about.

First, can the second-maximum item be the same as the maximum?

>>> a = [5, 3, 5, 4, 1, 2]
>>> a.sort()
>>> a[-1]
5
>>> a[-2]
5

If not, collapsing the list into a set is one way to address the issue.

>>> a = [5, 3, 5, 4, 1, 2]
>>> a = list(set(a))
>>> a.sort()
>>> a[-1]
5
>>> a[-2]
4

Second, sorting a list in Python is generally O(n log n), while getting the max or min is O(n) (TimeComplexity - Python Wiki). If you are working with large lists, especially extremely large ones, the overhead of sorting the list just to find the second-highest value won't be trivial. If performance matters, there is a good discussion thread on Stack Overflow about finding the second largest value: Get the second largest number in a list in linear time.
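
As a rough illustration of the linear-time idea, the sketch below tracks the two largest values seen so far in a single pass. It is one possible approach under the assumption that duplicates are kept (matching the sort-and-index result), and `second_largest_linear` is a hypothetical helper name, not code taken from the linked thread.

```python
def second_largest_linear(values):
    """Return the second-largest item in one pass, keeping duplicates."""
    largest = second = None
    for v in values:
        if largest is None or v > largest:
            # New maximum: the old maximum becomes the runner-up
            largest, second = v, largest
        elif second is None or v > second:
            second = v
    if second is None:
        raise ValueError("need at least two values")
    return second


print(second_largest_linear([3, 5, 2, 4, 1]))     # 4
print(second_largest_linear([5, 3, 5, 4, 1, 2]))  # 5
```

The standard library's `heapq.nlargest(2, values)[-1]` gives the same result for lists with at least two items, without a hand-written loop.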

DanPatterson_Retired
MVP Emeritus

Joshua Bixby, it is the statistician in me: one never removes duplicates from a list, since all observations are equal. I was simply accessing, via indexing, the second largest value in a list without changing the length of the list. If you did change it, you would be working with a different list and not the one in question, and your question would have to be re-posed.

>>> a = [5,5,5,5,5]
>>> a[-2]
5
>>> N = len(a)
>>> b = list(set(a))
>>> b[-2]
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
IndexError: list index out of range
>>> a[-2]
5
>>> b == a
False
>>>
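
Both views can be accommodated by making the choice explicit and guarding against a list that is too short. The following is a sketch only; the `second_largest` helper and its `distinct` flag are hypothetical names introduced here for illustration.

```python
def second_largest(values, distinct=False):
    """Return the second-largest value, or None if there is not one.

    distinct=False keeps duplicates (every observation counts);
    distinct=True collapses ties before ranking.
    """
    candidates = set(values) if distinct else values
    ranked = sorted(candidates)  # sorted() leaves the input list unchanged
    return ranked[-2] if len(ranked) >= 2 else None


a = [5, 5, 5, 5, 5]
print(second_largest(a))                 # 5
print(second_largest(a, distinct=True))  # None
print(second_largest([5, 3, 5, 4, 1, 2], distinct=True))  # 4
```

Returning `None` instead of raising `IndexError` is a design choice; raising may be preferable if a missing second value should stop processing.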
JamesCrandall
MVP Frequent Contributor

I can see the need to deal with duplicate "second highest" values being found. If you need to maintain those (i.e. keep the duplicates), then a pandas DataFrame might be of use.

Just for an example I created a .csv from the OP's data source sample and added an extra row that is a duplicate of Area 3 (the 2nd highest value):

Area 1 = 1

Area 2  = 5

Area 3 = 15

Area 4 = 16

Area 5 = 10

Area 6 = 15

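Edit: this is far simpler than I had originally posted. I had overlooked some of the powerful capabilities of this library. Simply determine the second highest value, then return the rows equal to that value:

import pandas as pd

data = r'H:\RankData.csv'
df = pd.read_csv(data)
# Second-largest distinct value; max() - 1 only works when the top two values differ by exactly 1
secondval = df['Values'].drop_duplicates().nlargest(2).iloc[-1]
# Keep every row that shares that value, so duplicates are retained
new_frame = df[df['Values'] == secondval]
print(new_frame)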