Merge pull request #7 from codeclassroom/dev
0.4 release
Bhupesh-V authored Mar 10, 2020
2 parents a796b3f + 7f1aabc commit 4b31e40
Showing 12 changed files with 172 additions and 86 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,13 @@
# Changelog


## [0.4] - March 10, 2020

### Changed [⚠️ Breaking Changes]
- `getShareScores` & `getInsights` have been decoupled from the `check` class; they now have to be imported separately (as `share_scores` and `insights`).
- Minor changes in the `analyze.py` module.


## [0.3] - Jan 1, 2020

### Added
20 changes: 12 additions & 8 deletions demo.py
@@ -1,26 +1,30 @@
"""Usage example"""
import os
import pprint
from plagcheck import plagcheck
from plagcheck.plagcheck import check, insights, share_scores

from dotenv import load_dotenv
load_dotenv()

language = "python"
language = "java"
userid = os.environ["USER_ID"]


moss = plagcheck.check(language, userid)
moss = check(language, userid)

moss.addFilesByWildCard("testfiles/test_python*.py")
moss.addFilesByWildCard("testfiles/test_java*.java")

# or moss.addFile("testfiles/test_python.py")

moss.submit()

print(moss.getHomePage())
pprint.pprint(moss.getResults())
# print frequency of each shared solution
pprint.pprint(moss.getShareScores())

result = moss.getResults()

pprint.pprint(result)

# print potential distributor-culprit relationships
pprint.pprint(moss.getInsights())
pprint.pprint(insights(result))
# print frequency of each shared solution
pprint.pprint(share_scores(result))
7 changes: 7 additions & 0 deletions docs/changelog.md
@@ -1,6 +1,13 @@
# Changelog


## [0.4] - March 10, 2020

### Changed [⚠️ Breaking Changes]
- `getShareScores` & `getInsights` have been decoupled from the `check` class; they now have to be imported separately (as `share_scores` and `insights`).
- Minor changes in the `analyze.py` module.


## [0.3] - Jan 1, 2020

### Added
51 changes: 51 additions & 0 deletions docs/insights.md
@@ -0,0 +1,51 @@
# Insights

PlagCheck provides algorithmic analysis of Moss results.

### Terminologies

### 1. Node
Nodes are the results returned by Moss, i.e. every
individual file.

### 2. Tags
Tags are the roles a file serves, i.e. a tag marks a file as
a potential distributor, a potential culprit, or
both.

### 3. M-group
M-groups (moss-groups) are groups of solutions which have similar code.
For example, a student who solves a programming problem may share their
solution with 3 friends; that is a single m-group with 4 nodes.

For example, if you run [demo.py](https://github.com/codeclassroom/PlagCheck/blob/master/demo.py), `insights()` will return the following data:
```java

{'DCtoC Paths': [('testfiles/test_java5.java', 'testfiles/test_java2.java'),
('testfiles/test_java4.java', 'testfiles/test_java2.java')],
'DtoC Paths': [('testfiles/test_java3.java', 'testfiles/test_java2.java'),
('testfiles/test_java3.java', 'testfiles/test_java.java'),
('testfiles/test_java7.java', 'testfiles/test_java6.java')],
'DtoDC Paths': [('testfiles/test_java3.java', 'testfiles/test_java5.java'),
('testfiles/test_java3.java', 'testfiles/test_java4.java')]}

```

This analysis can be visualized as the following _Disconnected Directed Graph_:

![moss results](https://drive.google.com/uc?export=view&id=1Lc8obgjihfo7EGimn300mTtqfmHK0Zem)

We assign Tags to every individual Node.

1. D - Distributor
Student(s) who distributed their
code in a group.
2. C - Culprit
Student(s) who copied the shared
code.
3. DC - Both a Distributor & Culprit

In the above depicted graph, there are 2 unique _m-groups_.

1. Group 1 : [1, 2, 3, 4, 5]
2. Group 2 : [7, 6]
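
The m-groups listed above can be recovered from the `insights()` output by treating every path as an undirected edge and collecting connected components. The following is a minimal sketch, not part of PlagCheck: the helper name `m_groups` is hypothetical, and the sample dictionary is the one shown earlier on this page.

```python
from collections import defaultdict

# Sample data copied from the insights() output shown earlier.
insights_data = {
    "DCtoC Paths": [
        ("testfiles/test_java5.java", "testfiles/test_java2.java"),
        ("testfiles/test_java4.java", "testfiles/test_java2.java"),
    ],
    "DtoC Paths": [
        ("testfiles/test_java3.java", "testfiles/test_java2.java"),
        ("testfiles/test_java3.java", "testfiles/test_java.java"),
        ("testfiles/test_java7.java", "testfiles/test_java6.java"),
    ],
    "DtoDC Paths": [
        ("testfiles/test_java3.java", "testfiles/test_java5.java"),
        ("testfiles/test_java3.java", "testfiles/test_java4.java"),
    ],
}


def m_groups(data):
    """Hypothetical helper: group files into connected components."""
    # Build an undirected adjacency map from every (source, target) path.
    neighbours = defaultdict(set)
    for paths in data.values():
        for src, dst in paths:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk to collect everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


groups = m_groups(insights_data)
for number, group in enumerate(groups, start=1):
    print(f"Group {number}: {group}")
```

With this data the sketch yields two groups, one with five files and one with two, which agrees with the grouping described above.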
18 changes: 13 additions & 5 deletions docs/installation.md
@@ -2,16 +2,24 @@

Installing plagcheck is pretty simple, just run

`pip install plagcheck`
```bash
pip install plagcheck
```

Install a specific version

`pip install plagcheck==0.2`
```bash
pip install plagcheck==0.4
```

or directly from GitHub if you cannot wait to test new features

`pip install git+https://github.com/codeclassroom/PlagCheck.git`
```bash
pip install git+https://github.com/codeclassroom/PlagCheck.git
```

If you have already installed it and want to update
If you have an old version, update it using

`pip install --upgrade plagcheck`
```bash
pip install --upgrade plagcheck
```
55 changes: 30 additions & 25 deletions docs/usage.md
@@ -1,6 +1,6 @@
# Usage

plagcheck provides the following classes:
plagcheck provides the following classes & methods:

### check(files, lang, user_id)

@@ -16,29 +16,33 @@ plagcheck provides the following classes:
"""Usage example"""
import os
import pprint
from plagcheck import plagcheck
from plagcheck.plagcheck import check, insights, share_scores

from dotenv import load_dotenv
load_dotenv()

language = "python"
language = "java"
userid = os.environ["USER_ID"]


moss = plagcheck.check(language, userid)
moss = check(language, userid)

moss.addFilesByWildCard("testfiles/test_python*.py")
moss.addFilesByWildCard("testfiles/test_java*.java")

# or moss.addFile("testfiles/test_python.py")

moss.submit()

print(moss.getHomePage())
pprint.pprint(moss.getResults())
# print frequency of each shared solution
pprint.pprint(moss.getShareScores())

result = moss.getResults()

pprint.pprint(result)

# print potential distributor-culprit relationships
pprint.pprint(moss.getInsights())
pprint.pprint(insights(result))
# print frequency of each shared solution
pprint.pprint(share_scores(result))

```

@@ -72,18 +76,6 @@ c.getHomePage()
```python

c.getResults()
"""
[
{
"file1":"filename1.py",
"file2":"filename2.py",
"percentage": 34,
"no_of_lines_matched": 3,
"lines_matched":[["2-3", "10-11"]]
},
....
]
"""

```

@@ -162,14 +154,16 @@ program code that also appears in the base file is not counted in matches.
code for an assignment. Multiple Base files are allowed.
- You should use a base file if it is convenient; base files improve results, but are not usually necessary for obtaining useful information.

### 7. getShareScores()
**Parameters** : `None` <br>
<hr>

### share_scores()
**Parameters** : `Moss Results`(returned by `getResults()`) <br>
**Return Type** : `Dict` <br>
**Description**: Share Score is a utility which returns frequency of every individual file.<br>
**Demo**:
```python

c.getShareScores()
print(share_scores(moss_data))

# Will return
"""
@@ -179,4 +173,15 @@ c.getShareScores()
"""
```
Share Score is basically the frequency of each file appearing in Moss Results.
i.e. the higher the frequency, the more that solution is "shared" by different files.
i.e. the higher the frequency, the more that solution is "shared" by different files.
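
As a rough illustration of the counting described above, the sketch below rebuilds the share score with `collections.Counter` on made-up Moss results. The file names are hypothetical and `share_scores_sketch` is an illustrative stand-in, not the library function; only the `file1` and `file2` keys are assumed, matching what `share_scores` reads in this commit.

```python
import collections

# Hypothetical Moss results; only "file1" and "file2" matter here.
moss_data = [
    {"file1": "a.py", "file2": "b.py"},
    {"file1": "a.py", "file2": "c.py"},
    {"file1": "d.py", "file2": "a.py"},
]


def share_scores_sketch(results):
    """Count how often each file appears across all matched pairs."""
    files = []
    for result in results:
        files.append(result["file1"])
        files.append(result["file2"])
    return dict(collections.Counter(files))


scores = share_scores_sketch(moss_data)
print(scores)
```

Here `a.py` appears in three pairs, so it receives the highest score, while the other files each appear once.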

### insights()
**Parameters** : `Moss Results`(returned by `getResults()`) <br>
**Return Type** : `Dict` <br>
**Description**: See [Insights](/insights).<br>
**Demo**:
```python

print(insights(moss_data))

```
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -8,6 +8,7 @@ nav:
- Documentation: index.md
- Installation: installation.md
- Usage: usage.md
- PlagCheck Insights: insights.md
- Moss: moss.md
- Changelog: changelog.md
- About: about.md
2 changes: 1 addition & 1 deletion plagcheck/__init__.py
@@ -1,2 +1,2 @@
"""The MOSS interface package for CodeClassroom"""
from plagcheck.plagcheck import check
from plagcheck.plagcheck import check, insights, share_scores
2 changes: 1 addition & 1 deletion plagcheck/analyze.py
@@ -28,7 +28,7 @@ def __init__(self):
self.nodes = []
self.nodeCount = 0

def relatesTo(self, P1, P2, node1, node2):
def relate(self, P1, P2, node1, node2):
"""Set a path between two file nodes"""
node_obj_dict = {}

76 changes: 39 additions & 37 deletions plagcheck/plagcheck.py
@@ -43,6 +43,45 @@ def request(url: str):
return req.decode("utf-8")


def share_scores(moss_data: dict) -> dict:
"""Share Score Insights"""
similar_code_files = []
for result in moss_data:
similar_code_files.append(result["file1"])
similar_code_files.append(result["file2"])

# frequency of files which are similar
share_score = collections.Counter(similar_code_files)

return dict(share_score)


def insights(moss_data: dict) -> dict:
"""Analysis for Moss"""
mg = Mgroups()
similar_code_files = set()
insights = {}

for r in moss_data:
similar_code_files.add(r["file1"])
similar_code_files.add(r["file2"])

mg.createNodes(similar_code_files)

for r in moss_data:
mg.relate(
r["percentage_file1"], r["percentage_file2"], r["file1"], r["file2"]
)

mg.set_tags()

insights["DtoC Paths"] = mg.d2c()
insights["DtoDC Paths"] = mg.d2dc()
insights["DCtoC Paths"] = mg.dc2c()

return insights


class check:
"""
Args:
@@ -133,40 +172,3 @@ def getResults(self) -> Tuple[str, Results]:
"""Return the result as a list of dictionary"""

return self.moss_results

def getShareScores(self):
"""Share Score Insights"""
similar_code_files = []
for result in self.moss_results:
similar_code_files.append(result["file1"])
similar_code_files.append(result["file2"])

# frequency of files which are similar
share_score = collections.Counter(similar_code_files)

return dict(share_score)

def getInsights(self):
"""Analysis for Moss"""
mg = Mgroups()
similar_code_files = set()
insights = {}

for r in self.moss_results:
similar_code_files.add(r["file1"])
similar_code_files.add(r["file2"])

mg.createNodes(similar_code_files)

for r in self.moss_results:
mg.relatesTo(
r["percentage_file1"], r["percentage_file2"], r["file1"], r["file2"]
)

mg.set_tags()

insights["DtoC Paths"] = mg.d2c()
insights["DtoDC Paths"] = mg.d2dc()
insights["DCtoC Paths"] = mg.dc2c()

return insights