I have a CSV file for my study area, and I want to choose the best sub-area within this large area to run my hydrogeological model, based on maximum event precipitation.
I have some code, and I can index a point and get its values by thresholding.
For example, I choose index 1450 and show the values greater than 100.
The problem is: how can I identify the values I want, delete the useless data, and save the result to a new CSV file?
Or how can I display just the useful values together with their dates and pointid?
I want to see the values by date, along with a count of how many times each event happened.
%pylab inline
import pandas as pd
import numpy as np
from scipy import stats
import time
data_all = pd.read_csv('D:/Project_by_tiff_SA/CSV file/GPM2points_SA.csv')
# Keep only the daily rainfall columns (their names start with 'D_')
rain_all = data_all.loc[:, data_all.columns.str.startswith('D_')]
#Drop some fields
rain_drop=data_all.drop(["OID_","pointid","grid_code","Mean","Max"],axis=1)
#Transpose Data
trans_all=rain_drop.T
trans_all
# If I index like this, I get a table full of NaN.
# I need the values shown by date for all 17000 points.
thresh=trans_all[64500][trans_all[64500]>100]
thresh
# Is this the correct way to show value counts?
trans_all[64500][trans_all[64500]>500].value_counts()
# Save the thresholded result
thresh.to_csv("H123.csv", index=True)
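To make the output I am after concrete, here is a minimal sketch on a tiny made-up table. The point ids, dates, and rainfall values are hypothetical, not taken from my file; the only things it assumes from the real data are a `pointid` column and date columns whose names start with `D_`. It reshapes the table to one row per point and date, keeps only the values above the threshold, and counts the events per point.

```python
import pandas as pd

# Hypothetical toy table standing in for GPM2points_SA.csv (all values made up).
data_all = pd.DataFrame({
    "pointid": [1, 2, 3],
    "D_20150101": [12.0, 130.0, 0.0],
    "D_20150102": [105.0, 40.0, 99.0],
    "D_20150103": [101.0, 250.0, 5.0],
})

threshold = 100  # event precipitation threshold

# Reshape from wide (one column per date) to long (one row per point and date).
date_cols = [c for c in data_all.columns if c.startswith("D_")]
long_df = data_all.melt(id_vars="pointid", value_vars=date_cols,
                        var_name="date", value_name="rain")

# Keep only the rows above the threshold; everything else is discarded.
events = (long_df[long_df["rain"] > threshold]
          .sort_values(["pointid", "date"])
          .reset_index(drop=True))

# Count how many times each point exceeded the threshold.
event_counts = events.groupby("pointid").size().rename("n_events")

print(events)
print(event_counts)
```

Something like `events.to_csv("H123.csv", index=False)` would then save only the useful rows, with `pointid`, `date`, and `rain` as columns.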
The full CSV file has 17000 points, so I will attach a light version instead.
Attachment: a test dataset with only 24 pixels, which is enough to reproduce the problem.