{ "cells": [ { "cell_type": "code", "execution_count": 87, "id": "036980a4-7309-4ea4-8dac-e8901c4525cf", "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "id": "ba89b925-5fc4-41bc-8b5e-81e53cfe1e4e", "metadata": {}, "source": [ "# Getting the client counseling sessions data\n", "---\n", "The data can be obtained from this filter. You will need to break it up into smaller chunks and export them into the folder client_counseling\n", "https://pasbdc.neoserra.com/activity/list/10?__formid=10&remove=&savename=&sort=DATE&sortdir=DESC&expr=&field_1=DATE&opt_auto_1=pfy&field_2=CLI_TYPE2&opt_2=&opt_2=AC&opt_2=IC&field_3=F_CENTER_ID&opt_3=2805&opt_3=2790&opt_3=2782&opt_3=2784&opt_3=2806&opt_3=2789&opt_3=4491&opt_3=2783&opt_3=2807&opt_3=2809&opt_3=2788&opt_3=2780&opt_3=2808&opt_3=2786&opt_3=2785&opt_3=2787&opt_3=2791&opt_3=2781&field_4=&sortdir=DESC\n", "\n", "Then combine the csvs into one big file" ] }, { "cell_type": "code", "execution_count": 88, "id": "7c3bf1e4-57bc-419b-909d-fca0043c1df7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 828614 entries, 0 to 828613\n", "Data columns (total 9 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Session Date 828614 non-null object \n", " 1 Client 828614 non-null object \n", " 2 Client ID 828614 non-null object \n", " 3 Counselor 823892 non-null object \n", " 4 Session Type 828611 non-null object \n", " 5 Contact Type 828614 non-null object \n", " 6 Center 828614 non-null object \n", " 7 Prep+Contact 828614 non-null float64\n", " 8 Total Hours 828614 non-null float64\n", "dtypes: float64(2), object(7)\n", "memory usage: 56.9+ MB\n" ] } ], "source": [ "sessions_df = pd.read_csv('counselling_sessions_fy2225.csv')\n", "sessions_df.info()" ] }, { "cell_type": "markdown", "id": "acb44859-ef4f-4dcb-82a9-80e5c30ce778", "metadata": {}, "source": [ "# Get the unique clients list\n", "---\n", "Download and load the unique clients list with these columns\n", "https://pasbdc.neoserra.com/clients?__formid=3&remove=&savename=&sort=CLIENT_ID&sortdir=ASC&expr=&field_1=REVIEWID&opt_1=13213656&field_2=&sortdir=ASC" ] }, { "cell_type": "code", "execution_count": 90, "id": "4563ea40-4d94-41c5-86f4-6d620a45c1de", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0Client IDClientPrimary ContactLast CounselingPhoneEmailPhysical AddressPhysical Address CountyPhysical Address StatePrimary NAICSNAICsNAICS_2PA NAICs Code PercentagePASBDC NAICs Code Percentage
00WD04170\\tProinnov@ LLC (WD04170)Jardenson Castro9/9/2025 12:00 AM(267) 748-4465JardensonC@ICLOUD.com6752 Oakland St.PhiladelphiaPennsylvaniaNaNNaN0.00.00000014.915377
11WD02759\"C.J.A.\"/ Crawley Jones and Allen real estate...mark crawley10/20/2025 12:00 AM(215) 290-9828mrkcrawley@gmail.com673 Rively aveDelawarePennsylvania531390 - Other Activities Related to Real Esta...531390-OtherActivitiesRelatedtoRealEstate\\r\\r\\...53.02.5101272.688026
22PS018402Anjie's Cleaning Bees (PS018402)Anjelica Gonzez10/14/2024 12:00 AM(717) 521-3625anjelicagonzalez2001@gmail.com1129 High StLycomingPennsylvania561720 - Janitorial Services \\r\\r\\n561720-JanitorialServices\\r\\r\\n\\r\\r\\n56.03.6056474.344285
33C8538BRENIMAN PROPERTIES, LLC (C8538)RYAN BRENIMAN10/17/2025 12:00 AMNaNr_breniman@yahoo.com147 Heeter RdClarionPennsylvania531120 - Lessors of Nonresidential Buildings (...531120-LessorsofNonresidentialBuildings(except...53.02.5101272.688026
44BU016079Civil War Cider Co., Inc. (BU016079)Robert Antanitis, II10/21/2024 12:00 AM(570) 523-3414rob@civilwarcider.com606 Market St.UnionPennsylvania312130 - Wineries \\r\\r\\n312130-Wineries\\r\\r\\n\\r\\r\\n31.02.8763044.923522
\n", "
" ], "text/plain": [ " Unnamed: 0 Client ID Client \\\n", "0 0 WD04170 \\tProinnov@ LLC (WD04170) \n", "1 1 WD02759 \"C.J.A.\"/ Crawley Jones and Allen real estate... \n", "2 2 PS018402 Anjie's Cleaning Bees (PS018402) \n", "3 3 C8538 BRENIMAN PROPERTIES, LLC (C8538) \n", "4 4 BU016079 Civil War Cider Co., Inc. (BU016079) \n", "\n", " Primary Contact Last Counseling Phone \\\n", "0 Jardenson Castro 9/9/2025 12:00 AM (267) 748-4465 \n", "1 mark crawley 10/20/2025 12:00 AM (215) 290-9828 \n", "2 Anjelica Gonzez 10/14/2024 12:00 AM (717) 521-3625 \n", "3 RYAN BRENIMAN 10/17/2025 12:00 AM NaN \n", "4 Robert Antanitis, II 10/21/2024 12:00 AM (570) 523-3414 \n", "\n", " Email Physical Address Physical Address County \\\n", "0 JardensonC@ICLOUD.com 6752 Oakland St. Philadelphia \n", "1 mrkcrawley@gmail.com 673 Rively ave Delaware \n", "2 anjelicagonzalez2001@gmail.com 1129 High St Lycoming \n", "3 r_breniman@yahoo.com 147 Heeter Rd Clarion \n", "4 rob@civilwarcider.com 606 Market St. Union \n", "\n", " Physical Address State Primary NAICS \\\n", "0 Pennsylvania NaN \n", "1 Pennsylvania 531390 - Other Activities Related to Real Esta... \n", "2 Pennsylvania 561720 - Janitorial Services \\r\\r\\n \n", "3 Pennsylvania 531120 - Lessors of Nonresidential Buildings (... \n", "4 Pennsylvania 312130 - Wineries \\r\\r\\n \n", "\n", " NAICs NAICS_2 \\\n", "0 NaN 0.0 \n", "1 531390-OtherActivitiesRelatedtoRealEstate\\r\\r\\... 53.0 \n", "2 561720-JanitorialServices\\r\\r\\n\\r\\r\\n 56.0 \n", "3 531120-LessorsofNonresidentialBuildings(except... 53.0 \n", "4 312130-Wineries\\r\\r\\n\\r\\r\\n 31.0 \n", "\n", " PA NAICs Code Percentage PASBDC NAICs Code Percentage \n", "0 0.000000 14.915377 \n", "1 2.510127 2.688026 \n", "2 3.605647 4.344285 \n", "3 2.510127 2.688026 \n", "4 2.876304 4.923522 " ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "client_df = pd.read_csv('naics_client_list_tagged.csv')\n", "client_df.head()" ] }, { "cell_type": "code", "execution_count": 91, "id": "9307bc7a-8ba1-4c7f-b7eb-e2b5f6d4c8d7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Session Typecount
0Administrative476099
1Follow-up316915
2Initial/New17024
3Close-out14902
4Impact3671
\n", "
" ], "text/plain": [ " Session Type count\n", "0 Administrative 476099\n", "1 Follow-up 316915\n", "2 Initial/New 17024\n", "3 Close-out 14902\n", "4 Impact 3671" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session_type_value_counts = sessions_df['Session Type'].value_counts().reset_index()\n", "session_type_value_counts" ] }, { "cell_type": "code", "execution_count": 92, "id": "97465a49-4e13-4e23-88ba-6bd7d72c6520", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Session TypeClient IDCenterAdministrativeClose-outFollow-upImpactInitial/New
0D 14632Duquesne University SBDC00201
1S015040The University of Scranton SBDC01700
200000043TE - TEMPLE SBDC10000
300000052WD - WIDENER SBDC30800
4000002Kutztown University SBDC20000
\n", "
" ], "text/plain": [ "Session Type Client ID Center Administrative \\\n", "0 D 14632 Duquesne University SBDC 0 \n", "1 S015040 The University of Scranton SBDC 0 \n", "2 00000043 TE - TEMPLE SBDC 1 \n", "3 00000052 WD - WIDENER SBDC 3 \n", "4 000002 Kutztown University SBDC 2 \n", "\n", "Session Type Close-out Follow-up Impact Initial/New \n", "0 0 2 0 1 \n", "1 1 7 0 0 \n", "2 0 0 0 0 \n", "3 0 8 0 0 \n", "4 0 0 0 0 " ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session_counts = sessions_df.groupby('Client ID')['Session Type'].value_counts()\n", "client_centers = sessions_df.groupby('Client ID')['Center'].first()\n", "\n", "\n", "unified_counts_df = session_counts.unstack(fill_value=0)\n", "unified_counts_df['Center'] = client_centers\n", "\n", "\n", "\n", "all_cols = list(unified_counts_df.columns)\n", "all_cols.remove('Center')\n", "all_cols.insert(0, 'Center')\n", "\n", "unified_counts_df = unified_counts_df[all_cols].reset_index()\n", "unified_counts_df.head()" ] }, { "cell_type": "code", "execution_count": 93, "id": "0eb344e3-a1e2-4ae8-b2ad-a40cc775eec9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Center\n", "University of Pittsburgh SBDC 11263\n", "TE - TEMPLE SBDC 8851\n", "Kutztown University SBDC 8454\n", "WD - WIDENER SBDC 4452\n", "The University of Scranton SBDC 3798\n", "K - Kutztown SBDC 2608\n", "PennWest University Clarion SBDC 2574\n", "WI - WILKES SBDC 2387\n", "LE - LEHIGH UNIVERSITY SBDC 2211\n", "G - GANNON SBDC 1794\n", "Penn State SBDC 1782\n", "SH - SHIPPENSBURG SBDC 1751\n", "Duquesne University SBDC 1604\n", "Bucknell SBDC 1149\n", "SF - ST. FRANCIS UNIVERSITY SBDC 1033\n", "SV - ST. VINCENT COLLEGE SBDC 738\n", "G - Meadville 186\n", "SV - Fayette Outreach 176\n", "G - Mercer 131\n", "Indiana County 125\n", "Clarion CARES Act 20\n", "G - Warren 1\n", "SC - Monroe Outreach 1\n", "LE - Bucks County/Lehigh SBDC 1\n", "Name: count, dtype: int64" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "unified_counts_df['Center'].value_counts()" ] }, { "cell_type": "code", "execution_count": 94, "id": "e38f57b8-f362-4bba-ae20-b858f2be4504", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Center\n", "Pittsburgh 11263\n", "Temple 8851\n", "Kutztown University SBDC 8454\n", "Widner 4452\n", "Scranton 3798\n", "Kutztown 2608\n", "Clarion 2574\n", "Wilkes 2387\n", "Lehigh 2212\n", "Gannon 1794\n", "Penn State 1782\n", "Shippensburg 1751\n", "Duquesne 1604\n", "Bucknell 1149\n", "St. Francis 1033\n", "St. Vincent 738\n", "G - Meadville 186\n", "SV - Fayette Outreach 176\n", "G - Mercer 131\n", "Indiana County 125\n", "Clarion CARES Act 20\n", "G - Warren 1\n", "SC - Monroe Outreach 1\n", "Name: count, dtype: int64" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "center_mapping = {\n", " \"University of Pittsburgh SBDC\":\"Pittsburgh\",\n", " \"TE - TEMPLE SBDC\":\"Temple\",\n", " \"Kutztown University SBDC\": \"Kutztown\",\n", " \"K - Kutztown SBDC\":\"Kutztown\",\n", " \"WD - WIDENER SBDC\": \"Widner\",\n", " \"The University of Scranton SBDC\": \"Scranton\",\n", " \"PennWest University Clarion SBDC\":\"Clarion\",\n", " \"WI - WILKES SBDC\":\"Wilkes\",\n", " \"LE - LEHIGH UNIVERSITY SBDC\":\"Lehigh\",\n", " \"G - GANNON SBDC\":\"Gannon\",\n", " \"Penn State SBDC\":\"Penn State\",\n", " \"SH - SHIPPENSBURG SBDC\":\"Shippensburg\",\n", " \"Duquesne University SBDC\":\"Duquesne\",\n", " \"Bucknell SBDC\":\"Bucknell\",\n", " \"SF - ST. FRANCIS UNIVERSITY SBDC\": \"St. Francis\",\n", " \"SV - ST. VINCENT COLLEGE SBDC\":\"St. Vincent\",\n", " \"LE - Bucks County/Lehigh SBDC\":\"Lehigh\",\n", "}\n", "\n", "unified_counts_df['Center'] = unified_counts_df['Center'].replace(center_mapping)\n", "unified_counts_df['Center'].value_counts()" ] }, { "cell_type": "code", "execution_count": null, "id": "39772f4d-0c9b-40cb-ab7a-0d3e152bd7a3", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "369dddf9-4b1a-47ea-aa08-cc0404c0f60a", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 5 }