I am running the README.md steps on the Intel DevCloud. I generated full_under_200.txt with both the Julia script get_proteins_under_200aa.jl and the notebook julia_get_proteins_under_200aa.ipynb for good measure. A diff says the two files are different, but they look the same (tab-separated values).
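When two files look identical but diff reports a difference, the cause is often invisible bytes such as line endings or trailing whitespace. The following is a minimal sketch, not part of the repository, of one way to locate the first differing line; the helper name `first_difference` is hypothetical.

```python
from itertools import zip_longest


def first_difference(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first differing line, or None."""
    # Binary mode keeps line endings (\n vs \r\n) and other invisible bytes intact.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # zip_longest yields None for the shorter file once it runs out of lines.
        for number, (a, b) in enumerate(zip_longest(fa, fb), start=1):
            if a != b:
                return number, a, b
    return None
```

Printing the result shows the raw bytes of both lines, so a stray `\r` or trailing tab becomes visible. The paths passed in would be the two generated copies of full_under_200.txt, wherever they were saved.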
In the DevCloud environment, when I run angle_data_preparation_py.ipynb, I get an error when extracting data from text:
# Scan first n proteins
names = []
seqs = []
psis = []
phis = []
pssms = []
(...)
ValueError: could not convert string to float: '0.0\ (...)
Which can be suppressed by changing the function parse_lines(raw) to:
# Helper functions to extract numeric data from text
def parse_lines(raw):
    # added tab \t to suppress previous error
    return np.array([[float(x) for x in line.split("\t") if x != ""] for line in raw])
(...)
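For comparison, here is a hedged sketch of a more tolerant parser. It assumes each line holds only numbers separated by whitespace, which may not match the notebook's actual input; `parse_lines_tolerant` is a hypothetical name, not a function from the repository.

```python
def parse_lines_tolerant(raw):
    """Parse lines of whitespace-separated numbers into a list of float rows."""
    rows = []
    for line in raw:
        # str.split() with no argument splits on any run of whitespace
        # (spaces, tabs, \r, \n) and never produces empty fields.
        fields = line.split()
        if fields:  # skip lines that are blank after stripping
            rows.append([float(x) for x in fields])
    return rows
```

Wrapping the result in np.array(...) would match the return type of the original parse_lines. Note that this only hides stray whitespace; it does not explain why the input contains it.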
That gets past the first error, but then another one is thrown further down:
(...)
---> 10 outputs.append([phis[i][j], psis[i][j]])
11 # break
12 # print(i, "Added: ", len(seqs[i])-34,"total for now: ", long)
IndexError: list index out of range
Which I suspect has something to do with one of the previous outputs, and the features' "n. prots" not being the same:
# Ensure all features have same n. prots
print("Names: ", len(names))
print("Seqs: ", len(seqs))
print("PSSMs: ", len(pssms))
print("Phis: ", len(phis))
print("Psis: ", len(psis))
Names: 601
Seqs: 600
PSSMs: 600
Phis: 0
Psis: 0
Any suggestions on what could be wrong in parsing the full_under_200.txt file?
Hi there,
I don't know exactly what might be causing the error. I suggest debugging it manually: try to find the conflicting protein, inspect the lengths of the phi and psi lists, and so on.
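One way to do that manual check is sketched below. It assumes that the sequence, phi, and psi entries for a given protein should all have the same length, which is a guess about the notebook's data layout rather than something confirmed here; `find_mismatches` is a hypothetical helper.

```python
def find_mismatches(names, seqs, phis, psis):
    """Return (index, name, len_seq, len_phi, len_psi) for inconsistent proteins."""
    problems = []
    n = max(len(names), len(seqs), len(phis), len(psis))
    for i in range(n):
        name = names[i] if i < len(names) else None
        # None marks a feature list that has no entry for protein i.
        lengths = tuple(
            len(feature[i]) if i < len(feature) else None
            for feature in (seqs, phis, psis)
        )
        if name is None or None in lengths or len(set(lengths)) != 1:
            problems.append((i, name) + lengths)
    return problems
```

Calling find_mismatches(names, seqs, phis, psis) right after the extraction cell and printing the first few entries would show where the lists first fall out of step. With empty phi and psi lists, every protein would be reported, which points at the step that fills those lists rather than at a single bad record.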