decision-science / model_mvp
Commit 042a76e7
Authored Apr 15, 2019 by linfang.wang
Commit message: Coverage rate
Parent: b8eae307
Showing 1 changed file with 5 additions and 5 deletions
data/analyis/datacal.py (+5 -5)
@@ -147,7 +147,7 @@ def cal_miss(df,feature,classes=[]):
     tmp.loc[tmp[feature]==0,'flag']='0值'
     tmp.loc[tmp[feature]>0,'flag']='非0值'
-    headers=classes+['flag','cnt','miss_rate']
+    headers=classes+['flag','cnt','match_rate']
     if len(classes)>0:
         # == 分类型
         df_gp=pd.merge(
@@ -155,12 +155,12 @@ def cal_miss(df,feature,classes=[]):
             tmp.groupby(classes+['flag'])[feature].count().reset_index().rename(columns={feature:"cnt1"}),
             on=classes,how='left'
         )
-        df_gp['miss_rate']=np.round(1-df_gp.cnt1/df_gp.cnt,3)
+        df_gp['match_rate']=np.round(df_gp.cnt1/df_gp.cnt,3)
         df_out=df_gp
     else:
-        all=[['非0值',tmp.shape[0],round(1-tmp[tmp[feature]>0].shape[0]/tmp.shape[0],3)],
-             ['0值',tmp.shape[0],round(1-tmp[tmp[feature]==0].shape[0]/tmp.shape[0],3)],
-             ['缺失值',tmp.shape[0],round(1-tmp[(tmp[feature]<0)].shape[0]/tmp.shape[0],3)]]
+        all=[['非0值',tmp.shape[0],round(tmp[tmp[feature]>0].shape[0]/tmp.shape[0],3)],
+             ['0值',tmp.shape[0],round(tmp[tmp[feature]==0].shape[0]/tmp.shape[0],3)],
+             ['缺失值',tmp.shape[0],round(tmp[(tmp[feature]<0)].shape[0]/tmp.shape[0],3)]]
         df_all=pd.DataFrame(all,columns=headers)
         df_out=df_all
     return df_out[headers]
\ No newline at end of file
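
The change above swaps each rate from 1 - (bucket count / total) to (bucket count / total), so the column now reports the share of rows that fall into each bucket rather than the share that do not. A minimal plain-Python sketch of the ungrouped branch, assuming a simple list of numbers instead of the DataFrame column: the function name match_rates and the English labels are hypothetical stand-ins (for '非0值', '0值' and '缺失值'), and the empty-input guard is an addition that the diff itself does not contain.

```python
def match_rates(values):
    """Return [label, total_count, share] rows for a list of numbers.

    Hypothetical stand-in for the ungrouped branch of cal_miss.
    As in the diff, negative values are counted as the "missing" bucket.
    """
    total = len(values)
    if total == 0:
        # Guard added for this sketch; the diff does not handle empty input.
        return []
    buckets = [
        ("nonzero", sum(1 for v in values if v > 0)),   # '非0值'
        ("zero", sum(1 for v in values if v == 0)),     # '0值'
        ("missing", sum(1 for v in values if v < 0)),   # '缺失值'
    ]
    # New behaviour: share of rows in the bucket (previously 1 - share).
    return [[label, total, round(n / total, 3)] for label, n in buckets]


rows = match_rates([5, 0, -1, 3, 0, 8, -1, 2])
for row in rows:
    print(row)
```

With the sample list, four of the eight values are positive and two each are zero or negative, so the three shares add up to 1, which the old 1 - share figures did not.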